Last year, ByteSumo published a presentation explaining the detailed workings of the TrendCalculus method. It describes how the calculations work, and some of the things we’re trialling in terms of using it as a feature generator for machine learning. This was the first version of the document released to the general public as a […]
Mastering Spark for Data Science
We are proud to announce the publication of “Mastering Spark for Data Science”, of which our CEO Andrew Morgan was one of the authors. The book is a whirlwind of expertise, and is aimed at highlighting methods that represent the art of the possible with Spark. Antoine, Dave and Matt, the other authors, are all incredible […]

TrendCalculus version 1.0 released.
I am happy to announce that last week we released our new algorithm, TrendCalculus v1.0, and can confirm the release is running for other users as intended. The code can be found here on our public repo. It is published under a GPLv3 license. The code package delivers a fast command line tool for […]
TrendCalculus: A data science for studying trends.
Our new algorithm, TrendCalculus, was demonstrated at the London-Big-O Meet Up in December, and it generated wide interest. For all those who asked, I’ve uploaded a copy of the presentation (available in the viewer below), and I believe it is cross-posted to the meet-up site too. If you have questions, want to […]
Data Science Links
I was asked to share some of my data science bookmarks. Here they are, in no particular order: UPDATE: I got in an awesome link that betters my list. Check it out if my list isn’t long enough :0 Cloud Based Machine Learning Azure ML BigML Google Prediction API Ersatz Yottamine Skytree Zementis logicalglue Machine […]
Data Science is not an island
Can data science transform your business? If so, how do you organise it to deliver on that promise? My thoughts are: Data Science is not an island. Its potential to transform organisations comes when it collaborates effectively with the other critical functions needed to drive data-led innovation. Based on our experience, there are three […]
How to draw Enterprise Data Models
A great Enterprise Data Model should be one your organisation is proud to frame and hang on the wall. Traditionally, enterprise data models have been really poorly drawn. It is one of the reasons that data science is said to be sexy, and data architecture is not. Data Scientists are investing effort in having […]
Hadoop EU Summit: Amsterdam 2014
We are happy to announce we are going to be at the Hadoop EU Summit in Amsterdam, April 2-3, 2014. Behind the scenes, we have been helping to make it a great event. Andrew, our CEO, was on the selection committee for the Business Applications track, along with his long-time collaborator Evan Smith. Together […]
ByteSumo backs “Science to Data Science”
We are proud to announce our support for Science to Data Science. It is an organisation offering a conversion course for analytical PhD grads who would like to become commercial data scientists. As well as being sponsors, we are also on its organising committee. The course will be held in August 2014 in London. An exciting […]