- Series.describe(percentiles=None, include=None, exclude=None)
Generate descriptive statistics.
Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
Analyzes both numeric and object series, as well as DataFrame column sets of mixed data types. The output will vary depending on what is provided. Refer to the notes below for more detail.
- Parameters:
- percentiles : list-like of numbers, optional
The percentiles to include in the output. All should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
- include : 'all', list-like of dtypes or None (default), optional
A white list of data types to include in the result. Ignored for Series. Here are the options:
'all' : All columns of the input will be included in the output.
A list-like of dtypes : Limits the results to the provided data types. To limit the result to numeric types submit numpy.number. To limit it instead to object columns submit the numpy.object_ data type (the Python builtin object also works; the old numpy.object alias was removed in NumPy 1.24). Strings can also be used in the style of select_dtypes (e.g. df.describe(include=['O'])). To select pandas categorical columns, use 'category'.
None (default) : The result will include all numeric columns.
- exclude : list-like of dtypes or None (default), optional
A black list of data types to omit from the result. Ignored for Series. Here are the options:
A list-like of dtypes : Excludes the provided data types from the result. To exclude numeric types submit numpy.number. To exclude object columns submit the numpy.object_ data type (the Python builtin object also works; the old numpy.object alias was removed in NumPy 1.24). Strings can also be used in the style of select_dtypes (e.g. df.describe(exclude=['O'])). To exclude pandas categorical columns, use 'category'.
None (default) : The result will exclude nothing.
- Returns:
- Series or DataFrame
Summary statistics of the Series or DataFrame provided.
See also
- DataFrame.count
Count number of non-NA/null observations.
- DataFrame.max
Maximum of the values in the object.
- DataFrame.min
Minimum of the values in the object.
- DataFrame.mean
Mean of the values.
- DataFrame.std
Standard deviation of the observations.
- DataFrame.select_dtypes
Subset of a DataFrame including/excluding columns based on their dtype.
Notes
For numeric data, the result’s index will include count, mean, std, min, max as well as lower, 50 and upper percentiles. By default the lower percentile is 25 and the upper percentile is 75. The 50 percentile is the same as the median.
For object data (e.g. strings), the result’s index will include count, unique, top, and freq. The top is the most common value. The freq is the most common value’s frequency. Timestamps are instead summarized like numeric data, without std: their result includes count, mean, min, max and the percentiles.
If multiple object values have the highest count, then the top result will be arbitrarily chosen from among those with the highest count.
For mixed data types provided via a DataFrame, the default is to return only an analysis of numeric columns. If the dataframe consists only of object and categorical data without any numeric columns, the default is to return an analysis of both the object and categorical columns. If include='all' is provided as an option, the result will include a union of attributes of each type.
The include and exclude parameters can be used to limit which columns in a DataFrame are analyzed for the output. The parameters are ignored when analyzing a Series.
Examples
Describing a numeric Series.
>>> s = pd.Series([1, 2, 3])
>>> s.describe()
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
dtype: float64
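Missing values are excluded before the statistics are computed. The following is a minimal sketch of that behaviour, not part of the original examples; the Series s_missing is a hypothetical input chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical input: one of the three entries is missing.
s_missing = pd.Series([1.0, np.nan, 3.0])
summary = s_missing.describe()

# count reflects only the non-NaN observations, and the mean
# is computed over those two values (1.0 and 3.0).
print(summary["count"])  # 2.0
print(summary["mean"])   # 2.0
```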
Describing a categorical Series.
>>> s = pd.Series(['a', 'a', 'b', 'c'])
>>> s.describe()
count     4
unique    3
top       a
freq      2
dtype: object
Describing a timestamp Series.
>>> s = pd.Series([
...     np.datetime64("2000-01-01"),
...     np.datetime64("2010-01-01"),
...     np.datetime64("2010-01-01")
... ])
>>> s.describe()
count                      3
mean     2006-09-01 08:00:00
min      2000-01-01 00:00:00
25%      2004-12-31 12:00:00
50%      2010-01-01 00:00:00
75%      2010-01-01 00:00:00
max      2010-01-01 00:00:00
dtype: object
Describing a DataFrame. By default only numeric fields are returned.
>>> df = pd.DataFrame({'categorical': pd.Categorical(['d', 'e', 'f']),
...                    'numeric': [1, 2, 3],
...                    'object': ['a', 'b', 'c']
...                    })
>>> df.describe()
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0
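When a DataFrame has no numeric columns at all, the Notes state that the default falls back to describing the object and categorical columns. A minimal sketch of that case follows; the frame df_text is a hypothetical input built only for this illustration.

```python
import pandas as pd

# Hypothetical frame with no numeric columns.
df_text = pd.DataFrame({
    "categorical": pd.Categorical(["d", "e", "f"]),
    "object": ["a", "b", "c"],
})

# No include/exclude given: with nothing numeric to summarize,
# both remaining columns are described.
result = df_text.describe()
print(result)
```

The result is expected to carry the count, unique, top and freq rows for both columns instead of being empty.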
Describing all columns of a DataFrame regardless of data type.
>>> df.describe(include='all')
       categorical  numeric object
count            3      3.0      3
unique           3      NaN      3
top              f      NaN      a
freq             1      NaN      1
mean           NaN      2.0    NaN
std            NaN      1.0    NaN
min            NaN      1.0    NaN
25%            NaN      1.5    NaN
50%            NaN      2.0    NaN
75%            NaN      2.5    NaN
max            NaN      3.0    NaN
Describing a column from a DataFrame by accessing it as an attribute.
>>> df.numeric.describe()
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
Name: numeric, dtype: float64
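The include and exclude parameters are documented as ignored for a Series. The sketch below checks this on a hypothetical numeric Series, s_num, by passing an include filter that would drop a numeric column from a DataFrame.

```python
import pandas as pd

# Hypothetical numeric Series, comparable to the 'numeric' column above.
s_num = pd.Series([1, 2, 3], name="numeric")

plain = s_num.describe()
# include=[object] would drop a numeric column of a DataFrame,
# but it is ignored when describing a Series.
filtered = s_num.describe(include=[object])

print(plain.equals(filtered))
```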
Including only numeric columns in a DataFrame description.
>>> df.describe(include=[np.number])
       numeric
count      3.0
mean       2.0
std        1.0
min        1.0
25%        1.5
50%        2.0
75%        2.5
max        3.0
Including only string columns in a DataFrame description.
>>> df.describe(include=[object])
       object
count       3
unique      3
top         a
freq        1
Including only categorical columns from a DataFrame description.
>>> df.describe(include=['category'])
       categorical
count            3
unique           3
top              d
freq             1
Excluding numeric columns from a DataFrame description.
>>> df.describe(exclude=[np.number])
       categorical object
count            3      3
unique           3      3
top              f      a
freq             1      1
Excluding object columns from a DataFrame description.
>>> df.describe(exclude=[object])
       categorical  numeric
count            3      3.0
unique           3      NaN
top              f      NaN
freq             1      NaN
mean           NaN      2.0
std            NaN      1.0
min            NaN      1.0
25%            NaN      1.5
50%            NaN      2.0
75%            NaN      2.5
max            NaN      3.0
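Requesting custom percentiles. None of the examples above pass the percentiles parameter, so the following is a minimal sketch of its use. The Series s_range is a hypothetical input, chosen so that the 10th and 90th percentiles fall on whole numbers.

```python
import pandas as pd

# Hypothetical input: the integers 0 through 10.
s_range = pd.Series(range(11))

# Percentiles are given as fractions between 0 and 1 and appear
# in the result index as percentage labels such as '10%' and '90%'.
summary = s_range.describe(percentiles=[0.1, 0.9])

print(summary["10%"])  # 1.0
print(summary["90%"])  # 9.0
```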
pandas.Series.describe — pandas 2.2.2 documentation (2024)