Building a real-time big data pipeline 2: Spark Core, Hadoop, Scala

Apache Spark is a general-purpose, in-memory cluster computing engine for large-scale data processing. Spark can also work with Hadoop and its modules. Its real-time data processing capability makes Spark a top choice for big data analytics. Spark Core has two parts: 1) the computing engine and 2) the Spark Core APIs.
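The RDD pipeline that Spark Core is known for (flatMap → map → reduceByKey) can be sketched on plain Scala collections, with no Spark cluster required. This is an illustrative analogue under that assumption, not actual Spark API code; on Spark, the same chain would run distributed on an RDD created from a SparkContext.

```scala
// Word count with plain Scala collections, mirroring the classic Spark RDD
// pipeline: flatMap -> map -> reduceByKey (here approximated by groupBy + sum).
object WordCountSketch {
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))       // tokenize each line (RDD.flatMap analogue)
      .filter(_.nonEmpty)             // drop empty tokens
      .map(word => (word, 1))         // pair each word with a count of 1 (RDD.map analogue)
      .groupBy(_._1)                  // group pairs by word (shuffle analogue)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // reduceByKey analogue

  def main(args: Array[String]): Unit =
    println(wordCount(Seq("spark core apis", "spark engine"))) // Map(spark -> 2, ...)
}
```

The same functional shape is what makes the Spark Core API approachable from Scala: the cluster engine parallelizes these transformations, but the code reads like ordinary collection operations.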

Building a real-time big data pipeline 1: Kafka, RESTful, Java

Updated on September 20, 2021. Apache Kafka is used for building real-time data pipelines and streaming apps. Kafka is a message broker, which helps transmit messages from one system to another. ZooKeeper is required to run a Kafka cluster; Apache ZooKeeper is primarily used to coordinate the brokers and track the status of the cluster's nodes and topics.
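The broker pattern Kafka implements — producers append messages to a named topic, consumers read them sequentially at their own offset — can be sketched as a toy in-memory class. This is a hypothetical illustration (the `ToyBroker` name and methods are invented here, and real Kafka is distributed, persistent, and partitioned), not the Kafka client API.

```scala
import scala.collection.mutable

// Toy in-memory "broker": producers append to a named topic's log, and each
// consumer group reads sequentially via its own offset (Kafka-style, greatly simplified).
class ToyBroker {
  private val topics  = mutable.Map.empty[String, mutable.ArrayBuffer[String]]
  private val offsets = mutable.Map.empty[(String, String), Int] // (topic, group) -> next offset

  def produce(topic: String, message: String): Unit =
    topics.getOrElseUpdate(topic, mutable.ArrayBuffer.empty) += message

  def poll(topic: String, group: String): Option[String] = {
    val log = topics.getOrElse(topic, mutable.ArrayBuffer.empty)
    val off = offsets.getOrElse((topic, group), 0)
    if (off < log.length) { offsets((topic, group)) = off + 1; Some(log(off)) }
    else None // nothing new for this group yet
  }
}
```

Because offsets are tracked per consumer group rather than by deleting messages, two independent groups can each read the full topic — the same decoupling that lets Kafka feed many downstream systems from one pipeline.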

Quantitative proteomics: TMT-based quantitation of proteins

Quantification of proteins using isobaric labeling (tandem mass tags, TMT) starts with the reduction of disulfide bonds in proteins with dithiothreitol (DTT). Alkylation with iodoacetamide (IAA) after cystine reduction results in the covalent addition of a carbamidomethyl group that prevents re-formation of disulfide bonds. The proteins are then digested overnight with trypsin or trypsin/Lys-C to generate peptides for labeling.

Quantitative proteomics: label-free quantitation of proteins

Liquid chromatography (LC) coupled with mass spectrometry (MS) is widely used for protein expression quantification. Protein quantification by tandem MS (MS/MS) uses integrated peak intensities from precursor-ion (MS1) scans or features from fragment ions (MS2). Label-free quantification (LFQ) may be based on precursor-ion intensity (peak areas or peak heights) or on spectral counting.
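Both LFQ strategies mentioned above reduce to simple aggregations over identified peptide hits: sum the MS1 precursor intensities per protein, or count the identified spectra per protein. A minimal sketch under that assumption (the `PeptideHit` record and `Lfq` object are hypothetical names; real pipelines also normalize across runs and handle shared peptides):

```scala
// One record per identified peptide-spectrum match, with its MS1 precursor intensity.
final case class PeptideHit(protein: String, precursorIntensity: Double)

object Lfq {
  // Intensity-based LFQ: sum precursor (MS1) intensities of all hits per protein.
  def intensitySum(hits: Seq[PeptideHit]): Map[String, Double] =
    hits.groupBy(_.protein).map { case (p, hs) => (p, hs.map(_.precursorIntensity).sum) }

  // Spectral counting: number of identified spectra per protein.
  def spectralCount(hits: Seq[PeptideHit]): Map[String, Int] =
    hits.groupBy(_.protein).map { case (p, hs) => (p, hs.size) }
}
```

For example, two hits for protein P1 with intensities 2.0e6 and 3.0e6 give an intensity sum of 5.0e6 and a spectral count of 2; comparing these per-protein values across runs is the basis of a label-free comparison.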