HOME

TheInfoList



OR:

The Collective Knowledge (CK) project is an
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use and view the source code, design documents, or content of the product. The open source model is a decentrali ...
framework and
repository Repository may refer to: Archives and online databases * Content repository, a database with an associated set of data management tools, allowing application-independent access to the content * Disciplinary repository (or subject repository), an ...
to enable collaborative, reproducible and sustainable research and development of complex computational systems. CK is a small, portable, customizable and decentralized infrastructure helping researchers and practitioners: * share their code, data and models as reusable
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (prog ...
components and automation actions with unified
JSON JSON (JavaScript Object Notation, pronounced or ) is an open standard file format and electronic data interchange, data interchange format that uses Human-readable medium and data, human-readable text to store and transmit data objects consi ...
API An application programming interface (API) is a connection between computers or between computer programs. It is a type of software interface, offering a service to other pieces of software. A document or standard that describes how to build ...
, JSON meta information, and a UID based on
FAIR principles FAIR data is data which meets the FAIR principles of findability, accessibility, interoperability, and reusability (FAIR). The acronym and principles were defined in a March 2016 paper in the journal '' Scientific Data'' by a consortium of scie ...
* assemble portable workflows from shared components (such as multi-objective autotuning and Design space exploration) * automate,
crowdsource Crowdsourcing involves a large group of dispersed participants contributing or producing goods and services, goods or services—including ideas, Voting, votes, Microwork, micro-tasks, and finances—for payment or as volunteers. Contemporary ...
and reproduce benchmarking of complex computational systems * unify
predictive analytics Predictive analytics encompasses a variety of Statistics, statistical techniques from data mining, Predictive modelling, predictive modeling, and machine learning that analyze current and historical facts to make predictions about future or other ...
(
scikit-learn scikit-learn (formerly scikits.learn and also known as sklearn) is a free and open-source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support ...
, R, DNN) * enable reproducible and interactive papers


Notable usages

*
ARM In human anatomy, the arm refers to the upper limb in common usage, although academically the term specifically means the upper arm between the glenohumeral joint (shoulder joint) and the elbow joint. The distal part of the upper limb between ...
uses CK to accelerate computer engineering * Several ACM-sponsored conferences use CK to automate the Artifact Evaluation process * Imperial College (London) uses CK to automate and
crowdsource Crowdsourcing involves a large group of dispersed participants contributing or producing goods and services, goods or services—including ideas, Voting, votes, Microwork, micro-tasks, and finances—for payment or as volunteers. Contemporary ...
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
bug detection * Researchers from the
University of Cambridge The University of Cambridge is a Public university, public collegiate university, collegiate research university in Cambridge, England. Founded in 1209, the University of Cambridge is the List of oldest universities in continuous operation, wo ...
used CK to help the community reproduce results of their publication in the International Symposium on Code Generation and Optimization (CGO'17) during Artifact Evaluation * General Motors (USA) uses CK to crowd-benchmark
convolutional neural network A convolutional neural network (CNN) is a type of feedforward neural network that learns features via filter (or kernel) optimization. This type of deep learning network has been applied to process and make predictions from many different ty ...
optimizations * The
Raspberry Pi Foundation The Raspberry Pi Foundation is a UK-based educational charity founded in 2008 to promote the study of computer science and related subjects globally, particularly among young people. It is best known for initiating the Raspberry Pi series of sing ...
and the cTuning foundation released a CK workflow with a reproducible "live" paper to enable collaborative research into multi-objective autotuning and machine learning techniques *
IBM International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American Multinational corporation, multinational technology company headquartered in Armonk, New York, and present in over 175 countries. It is ...
uses CK to reproduce quantum results from nature * CK is used to automate MLPerf benchmark


Portable package manager for portable workflows

CK has an integrated cross-platform package manager with
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (prog ...
scripts,
JSON JSON (JavaScript Object Notation, pronounced or ) is an open standard file format and electronic data interchange, data interchange format that uses Human-readable medium and data, human-readable text to store and transmit data objects consi ...
API and
JSON JSON (JavaScript Object Notation, pronounced or ) is an open standard file format and electronic data interchange, data interchange format that uses Human-readable medium and data, human-readable text to store and transmit data objects consi ...
meta-description to automatically rebuild software environment on a user machine required to run a given research workflow.


Reproducibility of experiments

CK enables reproducibility of experimental results via community involvement similar to
Wikipedia Wikipedia is a free content, free Online content, online encyclopedia that is written and maintained by a community of volunteers, known as Wikipedians, through open collaboration and the wiki software MediaWiki. Founded by Jimmy Wales and La ...
and
physics Physics is the scientific study of matter, its Elementary particle, fundamental constituents, its motion and behavior through space and time, and the related entities of energy and force. "Physical science is that department of knowledge whi ...
. Whenever a new workflow with all components is shared via GitHub, anyone can try it on a different machine, with different environment and using slightly different choices (compilers, libraries, data sets). Whenever an unexpected or wrong behavior is encountered, the community explains it, fixes components and shares them back as described in.


References


External links

* Development site

* Documentation

* Public repository with crowdsourced experiments

* International Workshop on Adaptive Self-tuning Computing System (ADAPT) uses CK to enable public reviewing of publications and artifacts via
Reddit Reddit ( ) is an American Proprietary software, proprietary social news news aggregator, aggregation and Internet forum, forum Social media, social media platform. Registered users (commonly referred to as "redditors") submit content to the ...


{{FLOSS Workflow applications Build automation