Merge pull request #134 from grillazz/129-import-xlsx-endpoint
129 import xlsx endpoint
grillazz authored Feb 17, 2024
2 parents a5ea9f8 + e65b57e commit 2325704
Showing 8 changed files with 263 additions and 20 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -35,7 +35,7 @@ safety: ## Check project and dependencies with safety https://github.com/pyupio/

.PHONY: py-upgrade
py-upgrade: ## Upgrade project py files with pyupgrade library for python version 3.10
-	pyupgrade --py311-plus `find app -name "*.py"`
+	pyupgrade --py312-plus `find app -name "*.py"`

.PHONY: lint
lint: ## Lint project code.
31 changes: 19 additions & 12 deletions README.md
@@ -26,28 +26,25 @@
<li><a href="#how-to-feed-database">How to feed database</a></li>
<li><a href="#rainbow-logs-with-rich">Rainbow logs with rich</a></li>
<li><a href="#setup-user-auth">Setup user auth</a></li>
<li><a href="#local-development-with-poetry">Local development with poetry</a></li>
<li><a href="#import-xlsx-files-with-polars-and-calamine">Import xlsx files with polars and calamine</a></li>
</ul>
</li>

[//]: # ( <li><a href="#usage">Usage</a></li>)

[//]: # ( <li><a href="#roadmap">Roadmap</a></li>)

[//]: # ( <li><a href="#contributing">Contributing</a></li>)

[//]: # ( <li><a href="#license">License</a></li>)

[//]: # ( <li><a href="#contact">Contact</a></li>)
<li><a href="#acknowledgments">Acknowledgments</a></li>
</ol>
</details>

[//]: # (TODO: Usage,Roadmap, Contributing, License, Contact)






## About The Project

Example of [FastAPI](https://fastapi.tiangolo.com/) integration supported by almighty [Pydantic 2.0](https://github.com/pydantic/pydantic)
-with [SQLAlchemy ORM](https://www.sqlalchemy.org/) and PostgreSQL
+with [SQLAlchemy ORM](https://www.sqlalchemy.org/) and PostgreSQL16
connected via fastest Database Client Library for python/asyncio [asyncpg](https://github.com/MagicStack/asyncpg).

Beside of using latest and greatest version of [SQLAlchemy](https://www.sqlalchemy.org/) with it robustness, powerfulness and speed
@@ -131,13 +128,22 @@ poetry install
```
Hope you enjoy it.

### Import xlsx files with polars and calamine
This project uses the Polars library for data manipulation and analysis.
The import endpoint reads the Excel data into a DataFrame by passing the uploaded bytes to the `pl.read_excel()` function:
https://docs.pola.rs/py-polars/html/reference/api/polars.read_excel.html
In `pl.read_excel()`, the “calamine” engine can read all major types of Excel workbook (.xlsx, .xlsb, .xls) and is dramatically faster than the other options; it uses the fastexcel module to bind calamine.

<p align="right">(<a href="#readme-top">back to top</a>)</p>

## Acknowledgments
Use this space to list resources you find helpful and would like to give credit to.
I've included a few of my favorites to kick things off!

* [Open Source Shakespeare Dataset](https://github.com/catherinedevlin/opensourceshakespeare)
* [SQL Code Generator](https://github.com/agronholm/sqlacodegen)
* [Passlib - password hashing library for Python](https://passlib.readthedocs.io/en/stable/)
* [Polars - fast DataFrame library for Rust and Python](https://docs.pola.rs/)

<p align="right">(<a href="#readme-top">back to top</a>)</p>

@@ -155,7 +161,8 @@ I've included a few of my favorites to kick things off!
- **[JUL 25 2023]** add user authentication with JWT and Redis as token storage :lock: :key:
- **[SEP 2 2023]** add passlib and bcrypt for password hashing :lock: :key:
- **[OCT 21 2023]** refactor shakespeare models to use sqlalchemy 2.0 :fast_forward:
-- **[FEB 1 2024]** bum project to Python 3.12 :fast_forward:
+- **[FEB 1 2024]** bump project to Python 3.12 :fast_forward:
- **[MAR 15 2024]** add polars and calamine to project :features:
<p align="right">(<a href="#readme-top">back to top</a>)</p>


61 changes: 60 additions & 1 deletion app/api/nonsense.py
@@ -1,4 +1,7 @@
-from fastapi import APIRouter, Depends, status
+import io
+from fastapi import APIRouter, Depends, status, UploadFile, HTTPException
+from sqlalchemy.exc import SQLAlchemyError
+import polars as pl
from sqlalchemy.ext.asyncio import AsyncSession

from app.database import get_db
@@ -48,3 +51,59 @@ async def merge_nonsense(
     nonsense = Nonsense(**payload.model_dump())
     await nonsense.save_or_update(db_session)
     return nonsense
+
+
+@router.post(
+    "/import",
+    status_code=status.HTTP_201_CREATED,
+)
+async def import_nonsense(
+    xlsx: UploadFile,
+    db_session: AsyncSession = Depends(get_db),
+):
+    """
+    This function is a FastAPI route handler that imports data from an Excel file into a database.
+    Args:
+        xlsx (UploadFile): The Excel file that will be uploaded by the client.
+        db_session (AsyncSession): A SQLAlchemy session for interacting with the database.
+    Returns:
+        dict: A dictionary containing the filename and the number of imported records.
+    Raises:
+        HTTPException: If an error occurs during the process (either a SQLAlchemy error or an HTTP exception),
+            the function rolls back the session and raises an HTTP exception with a 422 status code.
+    """
+    try:
+        # Read the uploaded file into bytes
+        file_bytes = await xlsx.read()
+
+        # Use the `polars` library to read the Excel data into a DataFrame
+        nonsense_data = pl.read_excel(
+            source=io.BytesIO(file_bytes),
+            sheet_name="New Nonsense",
+            engine="calamine",
+        )
+        # Iterate over the DataFrame rows and create a list of `Nonsense` objects
+        nonsense_records = [
+            Nonsense(
+                name=nonsense.get("name"),
+                description=nonsense.get("description"),
+            )
+            for nonsense in nonsense_data.to_dicts()
+        ]
+        # Add all the `Nonsense` objects to the SQLAlchemy session
+        db_session.add_all(nonsense_records)
+        # Commit the session to save the objects to the database
+        await db_session.commit()
+        # Return a JSON response containing the filename and the number of imported records
+        return {"filename": xlsx.filename, "nonsense_records": len(nonsense_records)}
+    except (SQLAlchemyError, HTTPException, ValueError) as ex:
+        # If an error occurs, roll back the session
+        await db_session.rollback()
+        # Raise an HTTP exception with a 422 status code
+        raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY, detail=repr(ex)) from ex
+    finally:
+        # Ensure that the database session is closed, regardless of whether an error occurred or not
+        await db_session.close()
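
Reviewer sketch (not part of the commit): the row-to-record step in `import_nonsense` relies on `dict.get()`, so a sheet without a `description` column produces records with `None` rather than raising `KeyError`. The snippet below illustrates that behaviour with plain dictionaries shaped like the output of `DataFrame.to_dicts()`. `NonsenseRow` and `rows_to_records` are hypothetical stand-ins introduced here for illustration; they are not the project's `Nonsense` model or any function in the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NonsenseRow:
    """Hypothetical stand-in for the project's `Nonsense` ORM model."""

    name: Optional[str]
    description: Optional[str]


def rows_to_records(rows: list[dict]) -> list[NonsenseRow]:
    # Same shape as the handler's list comprehension: dict.get() returns
    # None for a missing key instead of raising KeyError.
    return [
        NonsenseRow(name=row.get("name"), description=row.get("description"))
        for row in rows
    ]


# One row with both columns, and one imitating a sheet that has no
# "description" column at all.
rows = [
    {"name": "alpha", "description": "first"},
    {"name": "beta"},
]
records = rows_to_records(rows)
```

Whether a silently missing column should be accepted or rejected is a design choice; under this pattern any rejection would have to come from the database (for example a NOT NULL constraint), which the handler would then report as a 422.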
3 changes: 3 additions & 0 deletions app/models/shakespeare.py
@@ -63,6 +63,7 @@ class Wordform(Base):
- `occurrences` (int): The number of occurrences of the word form.
"""

__tablename__ = "wordform"
__table_args__ = (PrimaryKeyConstraint("id", name="wordform_pkey"), {"schema": "shakespeare"})

@@ -133,6 +134,7 @@ class Chapter(Base):
- `paragraph` (list[Paragraph]): The paragraphs associated with the chapter.
"""

__tablename__ = "chapter"
__table_args__ = (
ForeignKeyConstraint(["work_id"], ["shakespeare.work.id"], name="chapter_work_id_fkey"),
@@ -193,6 +195,7 @@ class Paragraph(Base):
- `find(cls, db_session: AsyncSession, character: str) -> List[Paragraph]`: A class method that finds paragraphs associated with a specific character. It takes a database session and the name of the character as arguments, and returns a list of matching paragraphs.
"""

__tablename__ = "paragraph"
__table_args__ = (
ForeignKeyConstraint(["character_id"], ["shakespeare.character.id"], name="paragraph_character_id_fkey"),
6 changes: 1 addition & 5 deletions app/models/stuff.py
@@ -27,11 +27,7 @@ async def find(cls, db_session: AsyncSession, name: str):
         :param name:
         :return:
         """
-        stmt = (
-            select(cls)
-            .options(joinedload(cls.nonsense))
-            .where(cls.name == name)
-        )
+        stmt = select(cls).options(joinedload(cls.nonsense)).where(cls.name == name)
         result = await db_session.execute(stmt)
         instance = result.scalars().first()
         if instance is None: