Details
Improvement
Status: In Progress
Major
Resolution: Unresolved
0.1-incubating, 0.2-incubating
*nix
Description
While testing the dataset_processor module I noticed that most tests complete in less than 1 second, but the "test_daily_to_monthly_rebin" test takes over 2 minutes.
The test initially used a 1x1 degree grid covering the globe with a daily time step for 2 years (730 days).
Some initial checks suggest the lag is in '_rcmes_calc_average_on_new_time_unit_K', in the code that rebins the data:
    mask = np.zeros_like(data)
    mask[timeunits!=myunit,:,:] = 1.0
    # Calculate missing data mask within each time unit...
    datamask_at_this_timeunit = np.zeros_like(data)
    datamask_at_this_timeunit[:]= process.create_mask_using_threshold(data[timeunits==myunit,:,:],threshold=0.75)
    # Store results for masking later
    datamask_store.append(datamask_at_this_timeunit[0])
    # Calculate means for each pixel in this time unit, ignoring missing data (using masked array).
    datam = ma.masked_array(data,np.logical_or(mask,datamask_at_this_timeunit))
    meanstore[i,:,:] = ma.average(datam,axis=0)
That block is the likely suspect, since the rest of the code only does simple string parsing and list appends. I don't have time for a deep dive right now, and technically nothing is broken; it is just really slow.
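
One plausible reading of the snippet, not yet confirmed by profiling: for every time unit it allocates two arrays the size of the full dataset (roughly 47 million values for 730 days, assuming a 180 x 360 grid) and then averages a masked array spanning all 730 time steps, even though only about 30 of them belong to the month being processed. A minimal sketch of an alternative that slices out just the time steps for each unit before averaging is shown below. The names rebin_means and missing_fraction_mask are hypothetical, and missing_fraction_mask is only an assumed stand-in for process.create_mask_using_threshold, whose exact semantics I have not checked.

```python
import numpy as np
import numpy.ma as ma


def missing_fraction_mask(block, threshold=0.75):
    """Hypothetical stand-in for process.create_mask_using_threshold.

    Assumption: a pixel is masked when more than `threshold` of its
    time steps in `block` are missing.
    """
    missing_fraction = ma.getmaskarray(block).mean(axis=0)
    return missing_fraction > threshold


def rebin_means(data, timeunits, threshold=0.75):
    """Average `data` (time, y, x) over each distinct value in `timeunits`."""
    timeunits = np.asarray(timeunits)
    units = np.unique(timeunits)
    means = np.zeros((len(units),) + data.shape[1:])
    masks = np.zeros(means.shape, dtype=bool)
    for i, unit in enumerate(units):
        # Work only on the time steps that belong to this unit,
        # instead of masking out everything else in the full array.
        block = data[timeunits == unit]
        avg = ma.average(block, axis=0)
        means[i] = ma.filled(avg, 0.0)
        masks[i] = ma.getmaskarray(avg) | missing_fraction_mask(block, threshold)
    return units, ma.masked_array(means, mask=masks)


# Tiny example: 4 time steps in 2 units on a 2x2 grid.
values = np.arange(16, dtype=float).reshape(4, 2, 2)
missing = np.zeros(values.shape, dtype=bool)
missing[0, 0, 0] = missing[1, 0, 0] = True  # pixel (0, 0) missing in unit 1
example = ma.masked_array(values, mask=missing)
example_units, example_means = rebin_means(example, [1, 1, 2, 2])
```

With this shape, each iteration touches only the slices for one time unit, so the work per unit scales with the number of days in that unit rather than with the full record. Whether it reproduces the existing masking behaviour exactly would need to be checked against the current test.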