Projects with this topic
Python script to collect data from datasets and insert it into a Postgres database via an ORM. It can be extended to extract from other data sources (such as PDF, Word, or CSV files) and to apply custom functions. It ships with an Excel spreadsheet of the State of Minas Gerais' expenses from 2005 to 2016 as an example dataset for collection.
Python project to extract data from PDF files in the current directory. One script transforms the PDF data into a pandas DataFrame for eventual conversion to Excel. The other script stores the data as JSON for indexing in Elasticsearch, a non-relational database.
A yq- and jq-like dumper for YAML or JSON file contents, with drastically reduced functionality compared to the originals
A data scraping and manipulation library with a command-line interface and multi-database support for MSSQL, MySQL, PSQL, MariaDB, and SQLite. View the JSDoc and coverage reports at https://rdfedor.gitlab.io/datahorde
Use case using a modified version of MAKI to extract knowledge from public procurement information across Europe.
MAKI is a set of tools for web data knowledge extraction.
Repository to extract Riak data into a CSV file