PERF: DataFrame.copy preserve row/column order #44871

Closed
wants to merge 11 commits into from

jbrockmendel
Member

@jbrockmendel jbrockmendel commented Dec 13, 2021

By passing order="K" we get a faster copy and end up improving on a bunch of benchmarks where ArrayManager excels. On the downside, we take a hit in some arithmetic cases, which I'm still troubleshooting. The dropna slowdowns in the asv results pasted below should have been improved by #44857.
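
For readers unfamiliar with the flag, here is a minimal sketch of what order="K" changes. It assumes the flag is the same `order` argument that NumPy's `ndarray.copy` accepts; the call site inside pandas is not shown in this description, so the array below is only a stand-in for a block of same-dtype columns.

```python
import numpy as np

# Stand-in for a 2D block of values stored column-major (Fortran order).
# This is an illustrative assumption, not the actual pandas internals.
arr = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4))

# ndarray.copy defaults to order="C", which rewrites the data row-major.
copy_c = arr.copy(order="C")

# order="K" keeps the memory layout of the source as closely as possible,
# so the copy can walk the source buffer in its existing order.
copy_k = arr.copy(order="K")

print(arr.strides, copy_c.strides, copy_k.strides)
print(copy_c.flags.c_contiguous, copy_k.flags.f_contiguous)
```

Both copies hold the same values; only the strides differ. That layout difference is a plausible reason why some downstream operations get faster and others slower, though the exact cause of the arithmetic slowdowns is still open, as noted above.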

updated asv results

       before           after         ratio
     [adfc78b1]       [8605af4b]
     <fixmes23>       <perf-copy-K>
+     6.61±0.03ms      13.6±0.06ms     2.06  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function gt>, (1000000, 10))
+     6.84±0.08ms       12.7±0.1ms     1.85  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function gt>, (100000, 100))
+      15.3±0.6ms       22.6±0.2ms     1.47  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function add>, (1000000, 10))
+      16.8±0.2ms       23.0±0.2ms     1.37  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function add>, (100000, 100))
+     6.58±0.02ms      8.35±0.06ms     1.27  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function gt>, (10000, 1000))
+        21.5±1ms         27.3±2ms     1.27  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function floordiv>, (10000, 1000))
+      18.2±0.2ms       22.5±0.1ms     1.24  frame_methods.MaskBool.time_frame_mask_bools
+        24.8±2ms       30.2±0.7ms     1.22  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function floordiv>, (1000000, 10))
+         852±5μs      1.02±0.02ms     1.20  timeseries.AsOf.time_asof_nan_single('DataFrame')
+      22.3±0.3ms       26.5±0.8ms     1.19  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function floordiv>, (1000, 10000))
+        40.8±4ms       48.2±0.2ms     1.18  algos.isin.IsinWithRandomFloat.time_isin(<class 'numpy.float64'>, 750000, 'outside')
+        24.0±2ms       28.0±0.9ms     1.16  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function floordiv>, (100000, 100))
+      21.1±0.7ms       24.4±0.9ms     1.16  categoricals.Constructor.time_regular
+      11.4±0.2ms       13.2±0.8ms     1.15  io.pickle.Pickle.time_read_pickle
+      15.9±0.2ms         18.3±2ms     1.15  groupby.Nth.time_series_nth_all('datetime')
+        94.1±1μs         108±10μs     1.14  groupby.GroupByMethods.time_dtype_as_field('int', 'size', 'direct', 5)
+     2.65±0.03ms       2.98±0.2ms     1.12  frame_methods.Fillna.time_frame_fillna(True, 'pad', 'float32')
+      18.1±0.3ms         20.3±2ms     1.12  groupby.Nth.time_frame_nth_any('datetime')
+      8.53±0.3ms       9.48±0.3ms     1.11  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function gt>, (1000, 10000))
+      1.94±0.1μs      2.16±0.03μs     1.11  indexing_engines.NumericEngineIndexing.time_get_loc_near_middle((<class 'pandas._libs.index.Int8Engine'>, <class 'numpy.int8'>), 'monotonic_incr', False, 2000000)
+      1.71±0.2μs       1.90±0.2μs     1.11  index_cached_properties.IndexCache.time_inferred_type('CategoricalIndex')
+     1.68±0.02ms       1.86±0.2ms     1.11  timeseries.ResampleSeries.time_resample('datetime', '5min', 'ohlc')
+      16.1±0.1ms       17.8±0.1ms     1.10  arithmetic.FrameWithFrameWide.time_op_different_blocks(<built-in function add>, (10000, 1000))
-        138±20μs        125±0.8μs     0.91  series_methods.NanOps.time_func('sem', 1000, 'int32')
-        709±20ns         640±20ns     0.90  index_cached_properties.IndexCache.time_is_monotonic('RangeIndex')
-      24.0±0.2ms       21.6±0.3ms     0.90  frame_methods.Equals.time_frame_nonunique_unequal
-        613±60μs          551±8μs     0.90  arithmetic.IndexArithmetic.time_subtract('int')
-        215±10μs          194±4μs     0.90  algos.isin.IsIn.time_isin('Int64')
-        866±20μs         778±10μs     0.90  algos.isin.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.int64'>, 16, 'inside')
-        577±90μs         518±10μs     0.90  arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function le>)
-      7.83±0.2ms       7.02±0.2ms     0.90  categoricals.ValueCounts.time_value_counts(True)
-      23.9±0.2ms       21.4±0.3ms     0.89  frame_methods.Equals.time_frame_nonunique_equal
-         203±1μs          181±3μs     0.89  join_merge.Append.time_append_homogenous
-        752±10μs          669±5μs     0.89  algorithms.Quantile.time_quantile(0.5, 'higher', 'int')
-        678±30μs          600±4μs     0.89  arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 2, <built-in function ge>)
-      2.83±0.4μs       2.50±0.1μs     0.89  index_cached_properties.IndexCache.time_shape('IntervalIndex')
-      3.75±0.2ms       3.32±0.2ms     0.88  frame_methods.Fillna.time_frame_fillna(True, 'bfill', 'float64')
-        268±30μs          237±2μs     0.88  groupby.GroupByMethods.time_dtype_as_group('object', 'unique', 'direct', 1)
-        20.2±1ms       17.9±0.1ms     0.88  groupby.Nth.time_frame_nth_any('float64')
-      24.4±0.5ms      21.5±0.09ms     0.88  frame_methods.Equals.time_frame_object_equal
-      7.43±0.4ms      6.55±0.05ms     0.88  arithmetic.FrameWithFrameWide.time_op_same_blocks(<built-in function floordiv>, (1000, 10000))
-      2.62±0.3ms      2.30±0.02ms     0.88  arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 4, <built-in function add>)
-      8.72±0.9ms      7.65±0.09ms     0.88  algos.isin.IsinWithRandomFloat.time_isin(<class 'numpy.object_'>, 80000, 'outside')
-         600±5μs          526±6μs     0.88  arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function ge>)
-     1.60±0.06μs      1.40±0.05μs     0.87  indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int64Engine'>, <class 'numpy.int64'>), 'monotonic_incr', True, 2000000)
-         160±8ms          139±8ms     0.87  algos.isin.IsInLongSeriesValuesDominate.time_isin('object', 'monotone')
-      2.62±0.3ms      2.28±0.03ms     0.87  arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function add>)
-     8.03±0.09ms      6.94±0.06ms     0.86  indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')
-      2.23±0.2μs       1.92±0.1μs     0.86  index_cached_properties.IndexCache.time_shape('CategoricalIndex')
-     8.04±0.07ms      6.89±0.08ms     0.86  indexing.NumericSeriesIndexing.time_getitem_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')
-        442±40μs          378±5μs     0.86  algos.isin.IsinWithArangeSorted.time_isin(<class 'numpy.float64'>, 8000)
-      1.21±0.1μs      1.04±0.08μs     0.85  index_cached_properties.IndexCache.time_values('Float64Index')
-        58.3±7μs       49.7±0.5μs     0.85  algos.isin.IsIn.time_isin_mismatched_dtype('datetime64[ns]')
-      5.20±0.1ms       4.41±0.4ms     0.85  arithmetic.NumericInferOps.time_modulo(<class 'numpy.uint64'>)
-        175±20μs          148±1μs     0.84  algos.isin.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.float64'>, 12, 'inside')
-      7.72±0.3ms      6.50±0.07ms     0.84  algorithms.Duplicated.time_duplicated(False, False, 'int')
-        819±30μs         688±20μs     0.84  algorithms.Quantile.time_quantile(0.5, 'lower', 'uint')
-      4.71±0.2μs       3.92±0.1μs     0.83  algorithms.Duplicated.time_duplicated(True, 'last', 'uint')
-      2.03±0.3μs       1.70±0.2μs     0.83  index_cached_properties.IndexCache.time_values('TimedeltaIndex')
-      1.24±0.2μs      1.03±0.03μs     0.83  index_cached_properties.IndexCache.time_values('PeriodIndex')
-      8.60±0.7ms       7.13±0.2ms     0.83  algos.isin.IsIn.time_isin('category[object]')
-      9.17±0.3ms       7.42±0.2ms     0.81  algorithms.Factorize.time_factorize(False, False, 'int')
-        186±20μs          150±3μs     0.81  algos.isin.IsIn.time_isin_empty('Int64')
-      16.7±0.3ms       13.3±0.2ms     0.80  arithmetic.Ops.time_frame_multi_and(True, 1)
-        89.7±8μs         71.0±1μs     0.79  algos.isin.IsinAlmostFullWithRandomInt.time_isin(<class 'numpy.uint64'>, 11, 'outside')
-      16.3±0.6ms       12.8±0.4ms     0.78  algorithms.Factorize.time_factorize(False, True, 'Int64')
-       700±100μs          540±4μs     0.77  arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function lt>)
-      10.3±0.2ms      7.84±0.09ms     0.76  arithmetic.Ops.time_frame_multi_and(True, 'default')
-     1.15±0.07ms         867±20μs     0.75  algorithms.Quantile.time_quantile(1, 'higher', 'float')
-      14.6±0.5ms       10.9±0.2ms     0.75  arithmetic.Ops.time_frame_multi_and(False, 'default')
-      14.6±0.6ms       10.9±0.2ms     0.74  arithmetic.Ops.time_frame_multi_and(False, 1)
-      34.0±0.3ms       21.6±0.2ms     0.63  frame_methods.Equals.time_frame_object_unequal
-         110±2ms       47.3±0.9ms     0.43  join_merge.ConcatDataFrames.time_c_ordered(1, False)
-         110±2ms       46.2±0.4ms     0.42  join_merge.ConcatDataFrames.time_c_ordered(1, True)
-      64.5±0.4ms       21.7±0.4ms     0.34  frame_methods.Rename.time_dict_rename_both_axes
-      64.5±0.2ms       21.7±0.3ms     0.34  frame_methods.Rename.time_rename_both_axes
-      63.3±0.3ms       20.5±0.2ms     0.32  frame_methods.Rename.time_rename_axis0
-      61.7±0.3ms       18.8±0.2ms     0.30  frame_methods.Rename.time_rename_single
-     6.02±0.06ms      1.75±0.01ms     0.29  frame_methods.Equals.time_frame_float_unequal
-      59.5±0.4ms       16.8±0.3ms     0.28  frame_methods.Rename.time_rename_axis1
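
As a reading aid for the table above: the ratio column is the after time divided by the before time, so values above 1 are slowdowns and values below 1 are speedups. The sketch below recomputes it from the displayed cells. The helper names `parse_asv_time` and `ratio` are hypothetical and not part of asv, and because asv presumably computes the ratio from unrounded timings, a recomputed value can differ from the printed one in the last digit.

```python
import re

# Multipliers from the displayed unit to seconds. Both the Greek mu and the
# micro sign are accepted, since either may appear in copied output.
_UNIT_TO_SECONDS = {
    "ns": 1e-9,
    "\u03bcs": 1e-6,
    "\u00b5s": 1e-6,
    "ms": 1e-3,
    "s": 1.0,
}


def parse_asv_time(cell: str) -> float:
    """Convert a cell such as '6.61±0.03ms' to seconds, ignoring the spread."""
    match = re.fullmatch(r"([\d.]+)(?:±[\d.]+)?(\D+)", cell.strip())
    if match is None or match.group(2) not in _UNIT_TO_SECONDS:
        raise ValueError(f"unrecognized timing cell: {cell!r}")
    return float(match.group(1)) * _UNIT_TO_SECONDS[match.group(2)]


def ratio(before: str, after: str) -> float:
    """Return after / before, matching the meaning of the ratio column."""
    return parse_asv_time(after) / parse_asv_time(before)


# First row of the table: a slowdown.
print(round(ratio("6.61±0.03ms", "13.6±0.06ms"), 2))  # 2.06
# Rename.time_rename_both_axes: a speedup.
print(round(ratio("64.5±0.2ms", "21.7±0.3ms"), 2))  # 0.34
```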

@jreback jreback added the Performance (Memory or execution speed performance) label Dec 13, 2021
@jreback
Contributor

jreback commented Jan 16, 2022

@jbrockmendel rebase & ping when ready

@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Feb 16, 2022
@simonjayhawkins
Member

closing as stale

@jbrockmendel jbrockmendel deleted the perf-copy-K branch February 22, 2023 21:48
@jbrockmendel jbrockmendel added the Mothballed (Temporarily-closed PR the author plans to return to) label Mar 3, 2023