Python port of the R data package babynames. This package provides US baby names data from the Social Security Administration (SSA). It contains all names used for at least 5 children of either sex in the United States. The data can be loaded as either a Polars DataFrame (the default) or a Pandas DataFrame, selected by setting an environment variable.
Note
Please note that the pybabynames package is a community-driven initiative and is not affiliated with Posit, Tidyverse, or the main babynames R package. Its evolution and maintenance stem solely from the collective efforts of community members.
Install this library using pip into an environment that already has either Pandas or Polars installed:
pip install pybabynames
Missing Pandas or Polars? You can install these packages using:
pip install polars
pip install pandas
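If it is unclear whether the current environment already has one of the two libraries, a quick check along these lines may help. This is only a sketch using the standard library; the helper name installed_frameworks is hypothetical and is not part of pybabynames.

```python
from importlib.util import find_spec


def installed_frameworks(candidates=("polars", "pandas")):
    """Return the candidate DataFrame libraries importable in this environment."""
    # find_spec returns None when a top-level module cannot be located,
    # so nothing is actually imported here.
    return [name for name in candidates if find_spec(name) is not None]


available = installed_frameworks()
if available:
    print("Found:", ", ".join(available))
else:
    print("Neither polars nor pandas is installed; install one with pip.")
```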
import pybabynames as bn
# Retrieve DataFrame of baby names
babynames = bn.babynames
# Retrieve DataFrame of applicant data for SSN
applicants = bn.applicants
# Retrieve DataFrame of Birth Data
births = bn.births
# Retrieve DataFrame of life expectancy
lifetables = bn.lifetables
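To see what the four datasets look like without depending on a particular backend, something like the following sketch could be used. It relies only on the shape and columns attributes, which both Polars and Pandas DataFrames provide. The describe helper is hypothetical, and the import is guarded so the snippet does nothing if pybabynames is not installed.

```python
try:
    import pybabynames as bn
except ImportError:
    bn = None  # package not installed; the loop below is skipped


def describe(df):
    """Summarize a DataFrame using attributes shared by Polars and Pandas."""
    return {
        # Top-level module of the DataFrame class, e.g. "polars" or "pandas"
        "backend": type(df).__module__.split(".")[0],
        "rows": df.shape[0],
        "columns": list(df.columns),
    }


if bn is not None:
    # Attribute names taken from the usage example above
    for label in ("babynames", "applicants", "births", "lifetables"):
        print(label, describe(getattr(bn, label)))
```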
Important
By default, the package attempts to use the polars module. To switch to pandas, set the DATAFRAME_FRAMEWORK environment variable before the import statement for pybabynames, like so:
# Specify desired DataFrame framework
import os
os.environ["DATAFRAME_FRAMEWORK"] = "pandas"
# Load the package
import pybabynames as bn
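The reason the variable must be set before the import is presumably that the choice is read once, when the package is loaded. The sketch below illustrates that pattern in isolation. It is not the package's actual implementation: resolve_framework is a hypothetical helper, and the lowercase normalization and the error on unknown values are assumptions made for illustration.

```python
import os


def resolve_framework(default="polars"):
    """Return the DataFrame framework requested through the environment."""
    # DATAFRAME_FRAMEWORK is the variable named above; falling back to
    # "polars" mirrors the documented default.
    choice = os.environ.get("DATAFRAME_FRAMEWORK", default).strip().lower()
    # Assumption: only these two values are meaningful.
    if choice not in {"polars", "pandas"}:
        raise ValueError(f"Unsupported DATAFRAME_FRAMEWORK: {choice!r}")
    return choice


# Setting the variable first changes what a later lookup returns.
os.environ["DATAFRAME_FRAMEWORK"] = "pandas"
selected = resolve_framework()
print(selected)
```

A value set after the lookup has already happened would have no effect, which is why the ordering in the example above matters.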
To contribute to this library, first check out the code. Then create a new virtual environment:
cd pybabynames
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
python -m pip install -e '.[test]'
To run the tests:
python -m pytest
This Python package is a port of the R data package babynames by Hadley Wickham.