v.0.7.0 (February 9, 2012)
New features
- New unified merge function for efficiently performing the full gamut of database / relational-algebra operations. Refactored existing join methods to use the new infrastructure, resulting in substantial performance gains (GH220, GH249, GH267)
- New unified concatenation function for concatenating Series, DataFrame or Panel objects along an axis. Can form union or intersection of the other axes. Improves performance of Series.append and DataFrame.append (GH468, GH479, GH273)
- Can pass multiple DataFrames to DataFrame.append to concatenate (stack) and multiple Series to Series.append too
- Can pass list of dicts (e.g., a list of JSON objects) to DataFrame constructor (GH526)
- You can now set multiple columns in a DataFrame via __getitem__, useful for transformation (GH342)
- Handle differently-indexed output values in DataFrame.apply (GH498)
In [1]: df = pd.DataFrame(np.random.randn(10, 4))
In [2]: df.apply(lambda x: x.describe())
Out[2]:
               0          1          2          3
count  10.000000  10.000000  10.000000  10.000000
mean    0.190912  -0.395125  -0.731920  -0.403130
std     0.730951   0.813266   1.112016   0.961912
min    -0.861849  -2.104569  -1.776904  -1.469388
25%    -0.411391  -0.698728  -1.501401  -1.076610
50%     0.380863  -0.228039  -1.191943  -1.004091
75%     0.658444   0.057974  -0.034326   0.461706
max     1.212112   0.577046   1.643563   1.071804
[8 rows x 4 columns]
- Add reorder_levels method to Series and DataFrame (GH534)
- Add dict-like get function to DataFrame and Panel (GH521)
- Add DataFrame.iterrows method for efficiently iterating through the rows of a DataFrame
- Add DataFrame.to_panel with code adapted from LongPanel.to_long
- Add reindex_axis method to DataFrame
- Add level option to binary arithmetic functions on DataFrame and Series
- Add level option to the reindex and align methods on Series and DataFrame for broadcasting values across a level (GH542, GH552, others)
- Add attribute-based item access to Panel and add IPython completion (GH563)
- Add logy option to Series.plot for log-scaling on the Y axis
- Add index and header options to DataFrame.to_string
- Can pass multiple DataFrames to DataFrame.join to join on index (GH115)
- Can pass multiple Panels to Panel.join (GH115)
- Added justify argument to DataFrame.to_string to allow different alignment of column headers
- Add sort option to GroupBy to allow disabling sorting of the group keys for potential speedups (GH595)
- Can pass MaskedArray to Series constructor (GH563)
- Add Panel item access via attributes and IPython completion (GH554)
- Implement DataFrame.lookup, fancy-indexing analogue for retrieving values given a sequence of row and column labels (GH338)
- Can pass a list of functions to aggregate with groupby on a DataFrame, yielding an aggregated result with hierarchical columns (GH166)
- Can call cummin and cummax on Series and DataFrame to get cumulative minimum and maximum, respectively (GH647)
- value_range added as utility function to get min and max of a DataFrame (GH288)
- Added encoding argument to read_csv, read_table, to_csv and from_csv for non-ASCII text (GH717)
- Added abs method to pandas objects
- Added crosstab function for easily computing frequency tables
- Added isin method to index objects
- Added level argument to xs method of DataFrame.
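A few of the features listed above can be sketched together in one short session. This is an illustrative example only, written against a recent pandas release rather than 0.7.0 itself, so exact output formatting and defaults may differ from what 0.7.0 produced. The frames, column names and values (left, right, sales, and so on) are made up for the illustration.

```python
import pandas as pd

# DataFrame built from a list of dicts (e.g. parsed JSON records).
records = [
    {"key": "a", "x": 1},
    {"key": "b", "x": 2},
    {"key": "c", "x": 3},
]
left = pd.DataFrame(records)
right = pd.DataFrame({"key": ["a", "b", "d"], "y": [10.0, 20.0, 40.0]})

# Unified merge: the `how` argument selects the relational join type.
inner = pd.merge(left, right, on="key", how="inner")
outer = pd.merge(left, right, on="key", how="outer")

# Unified concatenation along the row axis. join="outer" keeps the union
# of the other axis (the columns); join="inner" keeps the intersection.
stacked_union = pd.concat([left, right], axis=0, join="outer", ignore_index=True)
stacked_common = pd.concat([left, right], axis=0, join="inner", ignore_index=True)

# Passing a list of functions to a groupby aggregation yields a result
# with hierarchical (two-level) columns. sort=False leaves the group keys
# in order of first appearance instead of sorting them.
sales = pd.DataFrame(
    {
        "store": ["n", "s", "n", "s"],
        "units": [3, 1, 5, 2],
        "price": [2.0, 4.0, 2.5, 3.5],
    }
)
summary = sales.groupby("store", sort=False).agg(["min", "max"])

# Cumulative maximum and minimum of a Series.
running_max = sales["units"].cummax()
running_min = sales["units"].cummin()

print(inner)
print(stacked_common)
print(summary)
```

The inner merge keeps only keys present in both frames, while the outer merge keeps every key and fills the missing side with NaN. The same union/intersection choice appears in concat through its join argument, applied to the axis that is not being concatenated.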
API changes to integer indexing
One of the potentially riskiest API changes in 0.7.0, but also one of the most important, was a complete review of how integer indexes are handled with regard to label-based indexing. Here is an example:
In [3]: s = pd.Series(np.random.randn(10), index=range(0, 20, 2))
In [4]: s
Out[4]:
0    -1.294524
2     0.413738
4     0.276662
6    -0.472035
8    -0.013960
10   -0.362543
12   -0.006154
14   -0.923061
16    0.895717
18    0.805244
Length: 10, dtype: float64
In [5]: s[0]