{"id":1989,"date":"2020-10-04T17:45:20","date_gmt":"2020-10-04T21:45:20","guid":{"rendered":"http:\/\/sys4seq.com\/?p=1989"},"modified":"2022-06-22T16:41:01","modified_gmt":"2022-06-22T20:41:01","slug":"building-a-real-time-big-data-pipeline-8-spark-mllib-regression-r","status":"publish","type":"post","link":"https:\/\/sys4seq.com\/index.php\/2020\/10\/04\/building-a-real-time-big-data-pipeline-8-spark-mllib-regression-r\/","title":{"rendered":"Building a real-time big data pipeline 8: Spark MLlib, Regression, R"},"content":{"rendered":"<p>Apache Spark MLlib\u00a0is a distributed framework that provides many utilities useful for machine learning tasks, such as classification, regression, clustering, dimensionality reduction, linear algebra, statistics, and data handling. R is a popular statistical programming language with a number of packages that support data processing and machine learning tasks. To address R\u2019s scalability issue, the Spark community developed the SparkR package\u00a0<sup id=\"fnref:4\" role=\"doc-noteref\"><\/sup>which is based on a distributed data frame that enables structured data processing with a syntax familiar to R users.<\/p>\n<p><a href=\"https:\/\/adinasarapu.github.io\/posts\/2020\/10\/blog-post-sparkr-mllib\/\">&gt;&gt;&gt;<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"Apache Spark MLlib\u00a0is a distributed framework that provides many utilities useful for machine learning tasks, such as classification, regression, clustering, dimensionality reduction, linear algebra, statistics, and data handling. R is a popular statistical programming language with a number of packages that support data processing and machine learning tasks. 
To address R\u2019s scalability issue, the","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0},"categories":[44,43],"tags":[81,58,56],"_links":{"self":[{"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/posts\/1989"}],"collection":[{"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/comments?post=1989"}],"version-history":[{"count":1,"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/posts\/1989\/revisions"}],"predecessor-version":[{"id":1990,"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/posts\/1989\/revisions\/1990"}],"wp:attachment":[{"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/media?parent=1989"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/categories?post=1989"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sys4seq.com\/index.php\/wp-json\/wp\/v2\/tags?post=1989"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}