clustering
Projects with this topic
Library and tools for similarity measurement, classification, and clustering of digital content, and for segmentation of images from digitized documents.
Updated -
A regular expression generator for arbitrary sets of strings. Returns the patterns with exact or generalised character sets, depending on the choice of the user, and facilitates clustering over patterns to create superpatterns.
Updated -
A program that utilizes cluster computing and parallel programming to simulate trading strategies on the Nordic stock market.
Updated -
The end... and maybe the beginning of the LinuxPMI kernel clustering extensions. Based on openMosix.
Updated -
"Cloud container data analytics, statistical modeling, and machine learning on distributed databases". "A free opensource alternative to SPSS, SAS, MATLAB, PowerBI, Tableau and Alteryx". Runs on Linux, Windows, macOS, and in the cloud via containers.
Updated -
DPCfam Workstation version. Runs on Linux-based systems. Developed and tested on Ubuntu 18. DPCfamW uses the moodycamel::ConcurrentQueue library ( https://github.com/cameron314/concurrentqueue ), which is freely available provided it is cited (Simplified BSD license). This version replicates the pipeline used to analyse UniRef50 (v. 2017_07) in Unsupervised protein family classification by Density Peak clustering, Russo ET, 2020, PhD Thesis ( http://hdl.handle.net/20.500.11767/116345 ), but with smaller datasets. The largest dataset we analysed is the TESTproteins_cd50.fasta dataset provided in this package. Due to memory bounds, we do not guarantee that the analysis of larger datasets is achievable with this version.
Updated -
This repository contains the implementation of the DPC-based algorithm described in Russo, E.T., Laio, A. & Punta, M. Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation. BMC Bioinformatics 22, 121 (2021). https://doi.org/10.1186/s12859-021-04013-x. Note that the implementation was written for the purpose of analysing, on a traditional workstation (8 GB RAM, 4-8 cores), query datasets with up to 5000 proteins, such as those analysed in the reference paper.
Updated -
pyAMNESIA: a Python pipeline for analysing the Activity and Morphology of NEurons using Skeletonization and other Image Analysis techniques.
Updated -
Code for Bayesian Deep Learning Workshop, NIPS 2017
Is Simple Better?: Revisiting Simple Generative Models for Unsupervised Clustering
Updated -
This repo compares the performance of a serial, an MPI-based, and a Spark-based implementation of a document clustering algorithm.
Updated