Have you ever had to manually copy and paste data from your PDFs, invoices and other tedious documents, only to re-enter it all in a spreadsheet or database? Pretty annoying, right?

Well, I have a solution that will save you precious time! It is called Sparrow, and it is an open source project that uses artificial intelligence to automate data extraction.

Sparrow Architecture

The tool can analyze your documents and automatically extract their important information thanks to a modular architecture made up of several specialized components:

  • Sparrow Parse : The heart of the system, which uses vision LLM models to understand visual content
  • Sparrow OCR : An efficient character recognition service
  • Sparrow ML LLM : The main engine that manages AI agents
  • Sparrow UI : An elegant graphical interface to control everything

What makes Sparrow particularly interesting is above all its ability to adapt to your needs. You can use it locally on your Mac with MLX, or opt for a cloud version with more powerful GPUs.

To start using Sparrow, here are the steps:

  1. Set up the Python environment:
# Install pyenv first (if not already done)
# Then create a Python virtual environment
pyenv install 3.10.4 # example version; pyenv expects an explicit version number
python -m venv venv
source venv/bin/activate # On Unix/Mac
# or
.\venv\Scripts\activate # On Windows
  2. Clone the Sparrow repository:
git clone https://github.com/katanaml/sparrow.git
cd sparrow
  3. Install the dependencies:
# Install the requirements for the agent you want to use
pip install -r requirements.txt
  4. Start the API:
python api.py
# Or on a specific port:
python api.py --port 8001
  5. Access the web interface (Sparrow UI):

The interface will be available at the following address:

http://127.0.0.1:8000/api/v1/sparrow-llm/docs
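
Before sending any documents, it can be handy to confirm from a script that the API is actually up. Here is a minimal sketch that simply requests the address above using Python's standard library. The helper names `docs_url` and `api_is_up` are mine, not part of Sparrow, and the default host and port are just the ones from the address above.

```python
import urllib.error
import urllib.request


def docs_url(host: str = "127.0.0.1", port: int = 8000) -> str:
    """Build the docs address shown above for a given host and port."""
    return f"http://{host}:{port}/api/v1/sparrow-llm/docs"


def api_is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling `api_is_up(docs_url())` should return `True` once the API has started, and `api_is_up(docs_url(port=8001))` covers the case where you started it on another port.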

To use Sparrow Parse with the local MLX backend:

./sparrow.sh "[your_json_schema]" \
--agent "sparrow-parse" \
--debug \
--options mlx \
--options mlx-community/Qwen2-VL-72B-Instruct-4bit \
--file-path "/path/to/your/file"
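
If you would rather drive this command from a script, here is a minimal sketch that assembles the same arguments in Python. It only uses the flags shown above. The function names, the example schema and the file path are placeholders of mine, and I am assuming the script is run from the directory that contains `sparrow.sh`.

```python
import json
import subprocess


def build_sparrow_command(schema, file_path,
                          model="mlx-community/Qwen2-VL-72B-Instruct-4bit",
                          debug=True):
    """Assemble the ./sparrow.sh call shown above as an argument list."""
    command = [
        "./sparrow.sh",
        json.dumps(schema),        # the JSON schema is passed as one argument
        "--agent", "sparrow-parse",
    ]
    if debug:
        command.append("--debug")
    command += [
        "--options", "mlx",        # backend
        "--options", model,        # model name, as a second --options flag
        "--file-path", file_path,
    ]
    return command


def run_sparrow(command):
    """Run the assembled command and raise if it exits with an error."""
    return subprocess.run(command, check=True)


# Hypothetical schema and path, for illustration only.
command = build_sparrow_command([{"description": "str", "amount": 0}],
                                "/path/to/your/file")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with the JSON schema.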

Sparrow UI

Now let’s take a practical example: extracting data from a bank statement. With a simple API request, Sparrow can analyze the document and automatically extract:

  • Account information
  • The balance
  • Transaction history
  • Totals

All of it is neatly structured in JSON, ready to be integrated into your applications. No need to tear your hair out over shaky regular expressions!
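
To give an idea of what "ready to be integrated" means in practice, here is a small sketch that loads such a JSON result and works with it. Be careful: the sample below is invented for illustration and is not real Sparrow output. The actual field names depend entirely on the schema you send in your query.

```python
import json

# Made-up sample, NOT real Sparrow output: field names are hypothetical
# and depend on the schema you pass in the query.
sample_response = """
{
  "account_number": "XXXX-0000",
  "balance": 1250.50,
  "transactions": [
    {"description": "Groceries", "amount": -45.20},
    {"description": "Salary", "amount": 2000.00}
  ]
}
"""

statement = json.loads(sample_response)

# Sum all transaction amounts and collect the descriptions of the debits.
total = sum(t["amount"] for t in statement["transactions"])
debits = [t["description"] for t in statement["transactions"] if t["amount"] < 0]

print(f"Balance: {statement['balance']:.2f}")
print(f"Net movement: {total:.2f}")
print(f"Debits: {debits}")
```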

Here is an example of a command to process a document:

curl -X 'POST' \
'http://127.0.0.1:8000/api/v1/sparrow-llm/inference' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'query=[{"description":"str","amount":0}]' \
-F 'agent=sparrow-parse' \
-F 'options=mlx,mlx-community/Qwen2-VL-72B-Instruct-4bit' \
-F 'debug=false' \
-F 'sparrow_key=' \
-F 'file=@/path/to/your/document.pdf;type=application/pdf'
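
The same request can be sketched in Python. The form fields below mirror the curl call one for one. The function names are mine, the upload field name `file` is an assumption on my part, and the sending part relies on the third-party `requests` library (`pip install requests`). I am also assuming the API answers with JSON, as the `accept` header suggests.

```python
import json
from pathlib import Path

INFERENCE_URL = "http://127.0.0.1:8000/api/v1/sparrow-llm/inference"


def build_form_fields(schema, backend="mlx",
                      model="mlx-community/Qwen2-VL-72B-Instruct-4bit"):
    """Return the text fields of the multipart form used in the curl call."""
    return {
        "query": json.dumps(schema, separators=(",", ":")),
        "agent": "sparrow-parse",
        "options": f"{backend},{model}",
        "debug": "false",
        "sparrow_key": "",
    }


def send_document(pdf_path, schema):
    """POST a PDF to the inference endpoint and return the decoded reply."""
    import requests  # third-party, imported here so the rest works without it

    with open(pdf_path, "rb") as pdf:
        response = requests.post(
            INFERENCE_URL,
            headers={"accept": "application/json"},
            data=build_form_fields(schema),
            # "file" as the field name is an assumption.
            files={"file": (Path(pdf_path).name, pdf, "application/pdf")},
        )
    response.raise_for_status()
    return response.json()


fields = build_form_fields([{"description": "str", "amount": 0}])
print(fields["query"])
```

With the API running, `send_document("/path/to/your/document.pdf", [{"description": "str", "amount": 0}])` would send the document with the same schema as the curl example.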

What really distinguishes Sparrow from other solutions is above all:

  • Its flexibility : compatible with different types of documents (invoices, statements, forms, etc.)
  • Its modular architecture : each component can be used independently
  • Its performance : fast processing even on modest hardware
  • Its open source nature : accessible and modifiable source code

And for companies with a turnover of less than $5 million (which is currently my case, lol), Sparrow can even be used commercially.

There you go… If you want to experiment with Sparrow, I invite you to test the online demo. It will give you a concrete idea of its capabilities before you install it.

Thank you to Letsar for this superb discovery!
