Enhancements
Driver for MongoDB connector. Adds a driver with unstructured version information to the
MongoDB connector.
Features
Add Databricks Volumes destination connector. Adds a Databricks Volumes connector to the ingest CLI. Users may now use unstructured-ingest to write partitioned data to a Databricks Volumes storage service.
Fixes
Fix support for different Chipper versions and prevent running PDFMiner with Chipper.
Treat YAML files as text. Adds YAML MIME types to the file detection code and treats those
files as text.
Fix FSSpec destination connectors check_connection. FSSpec destination connectors did not use check_connection. Trying to ls the destination directory raised an error, because the directory may not exist when the connector is created. Now check_connection calls ls on the bucket root, and this method is called when the destination connector is initialized.
Fix databricks-volumes extra location. setup.py pointed to the wrong location for the databricks-volumes extra requirements, which caused errors when building the wheel for unstructured. This change updates setup.py to point to the correct path.
Fix uploading None values to Chroma and Pinecone. Removes keys with None values before uploading to the Pinecone and Chroma destinations. Pins the Pinecone dependency.
Update documentation. (i) Recommends the 'skip_infer_table_types' param instead of 'pdf_infer_table_structure' as the best practice for table extraction, and (ii) fixes CSS and RST issues and a typo in the documentation.
Fix postgres storage of link_texts. Formatting of link_texts was breaking metadata storage.
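The Chroma and Pinecone fix above removes keys whose value is None before upload. A minimal sketch of that idea follows, assuming the metadata is a plain (possibly nested) dict. The helper name `drop_none_values` is hypothetical and is not taken from the unstructured codebase, whose actual implementation may differ.

```python
from typing import Any


def drop_none_values(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `record` without keys whose value is None.

    Hypothetical helper for illustration only; nested dicts are
    cleaned recursively, all other values are kept unchanged.
    """
    cleaned: dict[str, Any] = {}
    for key, value in record.items():
        if value is None:
            # Skip the key entirely instead of sending a None value.
            continue
        if isinstance(value, dict):
            value = drop_none_values(value)
        cleaned[key] = value
    return cleaned


metadata = {"filename": "a.pdf", "page_number": None, "coords": {"system": None, "x": 1}}
print(drop_none_values(metadata))
```

The function returns a new dict rather than mutating its input, so the original element metadata stays available to other destinations.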