Paul Graham at Y Combinator

Y Combinator 2014 Data Science Start-ups

There are new and exciting commercial opportunities in the data science space. We take a look at the data science start-ups from the latest Y Combinator batch.


Customer analytics

Framed Data. Framed Data is a good example of a new data science company: it uses machine learning to predict customer churn. This is an attractive problem to attack, since no company wants to lose customers.

We wish them much luck with their bigger mission: to turn all companies into data-informed organizations.

NextCaller. NextCaller is an advanced caller identification platform. If a caller matches a record in your CRM, that profile is shown on demand. If the caller does not match a record in your CRM, a general profile is shown instead, built from the database shared by all clients.

Sharing a CRM like this with many other companies is a huge benefit: each client makes use of far more data than it could collect alone. It will be exciting to see how this approach can be adapted for other businesses.
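The lookup order described above (own CRM first, shared database second) can be sketched in a few lines of Python. This is only an illustration of the fallback idea, not NextCaller's actual API: the function name, the dictionary layout and the phone numbers are all made up for the example.

```python
from typing import Optional


def lookup_caller(phone: str, crm: dict, shared: dict) -> Optional[dict]:
    """Return a caller profile, preferring the company's own CRM.

    `crm` and `shared` are hypothetical phone -> profile dictionaries.
    """
    if phone in crm:
        # Known customer: show the company's own record.
        return {"source": "crm", **crm[phone]}
    if phone in shared:
        # Unknown to this company: fall back to the shared database.
        return {"source": "shared", **shared[phone]}
    # Nobody has seen this number before.
    return None


# Made-up example data.
crm = {"+15550100": {"name": "Ada Example", "plan": "pro"}}
shared = {"+15550199": {"name": "Bob Example"}}

known = lookup_caller("+15550100", crm, shared)
general = lookup_caller("+15550199", crm, shared)
unknown = lookup_caller("+15550000", crm, shared)
```

The `source` field makes it explicit to the person answering the phone whether they are looking at their own customer record or at a general profile.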

Taplytics. Taplytics promises a fully-featured A/B testing platform for mobile apps. Use their new dashboard to receive stats on acquisition performance, user engagement and user retention.

Mobile app developers will want to A/B test their apps. If they don’t, they are leaving money on the table. Taplytics is an attractive platform for developers who do not want to worry about implementing A/B testing on their own.
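To give an idea of what "implementing A/B testing on their own" involves, here is a minimal sketch of the statistics behind a simple conversion test: a two-proportion z-test using only the standard library. It is a textbook method chosen for illustration, not a description of how Taplytics works, and the visitor and conversion counts are invented.

```python
import math


def ab_test_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-proportion z-test for variants A and B.

    Returns (z, p_value), where p_value is two-sided.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "A and B are equal".
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented numbers: 1000 users per variant, 100 vs 150 conversions.
z, p_value = ab_test_z(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A real platform also has to handle assigning users to variants, tracking events and deciding when to stop a test, which is where most of the work lies.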

Information retrieval

Algolia. Algolia is an API to create search engines for any site. Powerful tools enable developers to create search engines in a few clicks. Y Combinator’s internal search is powered by Algolia. We’ve tried their API, but found the results a little too fuzzy for our purpose. Fuzziness can of course be a strength: an intelligent search engine will also search for semantically and syntactically related terms. For NLP tasks, however, we prefer stricter results.
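The difference between strict and fuzzy matching can be illustrated with Python's standard `difflib` module. This is a toy sketch based on string similarity only (it knows nothing about semantically related terms) and it does not use or imitate Algolia's API; the term list and the cutoff value are arbitrary choices for the example.

```python
import difflib


def strict_search(query, terms):
    """Exact, case-insensitive matches only."""
    q = query.lower()
    return [t for t in terms if t.lower() == q]


def fuzzy_search(query, terms, cutoff=0.8):
    """Terms whose similarity ratio to the query is at least `cutoff`."""
    lowered = {t.lower(): t for t in terms}
    hits = difflib.get_close_matches(
        query.lower(), list(lowered), n=5, cutoff=cutoff
    )
    return [lowered[h] for h in hits]


terms = ["python", "pylon", "java", "javascript"]

# A misspelled query finds nothing with strict matching,
# while fuzzy matching still recovers "python".
strict_hits = strict_search("pyton", terms)
fuzzy_hits = fuzzy_search("pyton", terms)
```

Raising `cutoff` towards 1.0 makes the fuzzy search behave more and more like the strict one, which is the trade-off described above.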

Many searchers already use Google to navigate websites. Where internal search could shine is in customization and personalization: a webshop could rank more profitable products higher, or tailor the results to the searcher’s purchase history.
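As a rough sketch of such customized ranking, the snippet below re-orders search results by combining relevance with profit margin and a boost for categories the searcher has bought from before. The field names, weights and products are assumptions made up for this example, not part of any real search API.

```python
def rerank(results, purchase_history, margin_weight=0.5, history_weight=0.3):
    """Sort results by relevance, profit margin and purchase history.

    `results` is a list of dicts with hypothetical keys
    "relevance", "margin" and "category"; the weights are arbitrary.
    """
    def score(item):
        boost = history_weight if item["category"] in purchase_history else 0.0
        return item["relevance"] + margin_weight * item["margin"] + boost

    return sorted(results, key=score, reverse=True)


# Invented products for a query like "kettle".
results = [
    {"name": "budget kettle", "category": "kitchen", "relevance": 0.9, "margin": 0.1},
    {"name": "premium kettle", "category": "kitchen", "relevance": 0.8, "margin": 0.5},
    {"name": "kettle descaler", "category": "cleaning", "relevance": 0.7, "margin": 0.3},
]

# No purchase history: the higher-margin kettle moves to the top.
anonymous = [r["name"] for r in rerank(results, purchase_history=set())]

# A shopper who bought cleaning products before sees the descaler first.
returning = [r["name"] for r in rerank(results, purchase_history={"cleaning"})]
```

In practice the weights would have to be tuned, for instance with A/B tests, so that boosting profitable products does not hurt the relevance of the results too much.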

Online marketing

Boostable. Boostable makes online advertising easy and effective for anyone. Connect your store and set up your preferences, and your ads will automatically be shown to the right customers.

Improved conversion rates translate directly into improved profits. As long as Boostable can keep showing uplifts in visitors and sales, it will remain an interesting value proposition to advertisers.

Orankl. Orankl connects reviews with marketing emails. With very little effort you can add user tracking and review forms to your website. This information is then processed to generate marketing emails with product recommendations tailored to each user.

Personalization is the next step for conversion optimization. Instead of optimizing for everyone (which is bound to fail for some), optimize for each user’s data profile.

SendWithUs. SendWithUs integrates with your email service providers and has built-in support for A/B tests and extensive analytics.

Where SendWithUs distinguishes itself from other mail providers is by providing support for drip campaigns and allowing the marketer to manage email campaigns without the aid of a developer.

StackLead. StackLead connects multiple data sources to give you insight into your new customers. From just a name and an e-mail address, they try to find data and metrics such as industry, title and number of employees.

If use of StackLead leads to closing more deals and warmer leads, they’ll be in business for a very long time.

Data dashboards

Abacus. Abacus solves the hassle around business expenses. Soon after expenses are approved the employee will get paid. Managers and accountants get an overview of spending by employee, category, project, vendor and location. Syncing with bookkeeping software is also possible.

We think Abacus is likely to disrupt the back office. It solves an inefficiency and benefits both the employers and employees.

Rocketrip. With Rocketrip you reduce spending by rewarding employees for saving. Combining custom settings and real-time market data, employees can book hotels and flights, or rent cars, all within the company’s budget.

Much like Abacus, the value for a business is clear: less hassle and saved expenses.

Ambition. Ambition is a start-up in company metrics. It integrates a wide variety of platforms (phone, spreadsheet, CRM). With its own Ambition metric it aims to capture employee productivity.

An ambitious problem to attack with a lot of potential for growth.

42Technologies. 42Technologies is the broadest data science application of this batch. It offers data analysis for retailers, with a focus on recommendations, performance, customer analytics, growth, and ROI optimization.

With a great team and a lot of dedication to let retailers make better data-informed decisions, we have little doubt that they’ll make a dent.

Tracking & Analytics

Bellabeat. Bellabeat’s hook is to use your phone to listen to your baby’s heartbeat. After that it keeps its users engaged by tracking their baby’s movement and pregnancy weight gain.

A good example of a niche where data applications can thrive. We don’t expect this market to be a passing fad. People will want statistics about every aspect of their lives, including their pregnancies.

PiinPoint. PiinPoint enables businesses to find the best locations for expansion. A clear value proposition for researching the markets and their competitors.

In a market with rich customers, if PiinPoint manages to onboard new users, they stand to make a lot of money for themselves and their customers.

Terravion. Terravion lets farmers subscribe to aerial photography: select just the patches of land you want, and their airplane will start to take pictures. All data is conveniently available in an online dashboard.

Another company, The Climate Corporation, has already proven that huge buy-outs are possible in this market. Just securing a small slice of the pie will be hugely profitable.


CareMessage. CareMessage offers simplified care management for hospitals and doctors. Use their dashboards to automate reminders, send personalized messages and start tailored programs.

We find these evidence-based health education programs very interesting: replicating the experience of interacting with a live health coach with a digital app.

Immunity Project. Immunity Project wins this list for the most noble goal: End HIV/AIDS and offer the cure for free. Talk about “making the world a better place”.

Their statistical analysis shows promising research opportunities. Let’s hope that the cure for AIDS becomes the biggest victory of data science to date.

TrueVault. With TrueVault one can store medical data in a HIPAA-compliant manner.

TrueVault targets healthcare app developers. Working with and storing data becomes tricky when legal issues arise. Data security and data provenance will only become bigger topics in the future, allowing TrueVault plenty of opportunity for growth (perhaps also in other niches).

Data Mining

Kimono Labs. Kimono Labs turns websites into APIs with a visual editor and a few clicks. No more messy scraping: with Kimono Labs one does not need to write any code.

Allowing customers to quickly get to the data they want is a great feature. Intelligent pattern extraction saves a lot of time, both for the novice and expert.


The Dating Ring. A standard dating site with a twist: using matchmakers to set up dates. The Dating Ring makes extensive use of algorithms for matching.

Fixing online dating was one of the ambitious start-up ideas that YC posed. Any site that can overcome the chicken-and-egg problem is a force to be reckoned with.


TradeBlock. TradeBlock jumps into the Bitcoin fray with a data analysis platform for digital currencies. TradeBlock offers dashboards to track mining and markets and to research investing opportunities.

A no-brainer for YC. Digital currencies are a new phenomenon and this market definitely benefits from data. TradeBlock could become the final end-point for all statistics and analytics related to digital currencies.

Zidisha. Zidisha will use data science to improve its micro-financing platform. The non-profit Zidisha employs another YC startup, Bayes Impact, to make use of this data.

“We have enough data so that we can use data science to develop an algorithm to predict not just fraud but also credit risk.”

Machine Translation & NLP

Unbabel. Unbabel brands itself as “translation as a service”. By combining machine translation with human translators, they get better and faster results.

Adding the human element to machine learning is interesting. Instead of leaving it all up to some black-box algorithm, Unbabel places machine learning in service of the humans.

Another start-up in this batch offers natural language for the internet of things. In short, it wants to answer questions posed in natural language, such as “what is the weather like tomorrow in SF?”, by turning the question into an intent, parsing the location and finding the right forecast.

As devices become smaller and smaller, and our need to use computers to remember facts increases, so does the need for an intelligent platform that understands natural language. In combination with improved speech-to-text in the near future we may wonder how we ever used those tiny keyboards.
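The idea of turning a question into an intent can be illustrated with a deliberately naive, rule-based sketch. A real natural language platform would rely on trained models rather than keyword rules; the intent name, the list of day words and the regular expression below are all assumptions invented for this example.

```python
import re

# Hypothetical vocabulary of day expressions this toy parser understands.
KNOWN_DAYS = ("today", "tomorrow")


def parse_weather_query(text):
    """Turn a weather question into a small intent dictionary.

    Returns None when the text does not look like a weather question.
    """
    lowered = text.lower()
    if "weather" not in lowered:
        return None
    # Default to "today" when no known day word is mentioned.
    day = next((d for d in KNOWN_DAYS if d in lowered), "today")
    # Treat whatever follows the last standalone "in" as the location.
    match = re.search(r"\bin\s+([A-Za-z .]+?)\s*\??$", text)
    location = match.group(1).strip() if match else None
    return {"intent": "get_weather", "day": day, "location": location}


intent = parse_weather_query("what is the weather like tomorrow in SF?")
no_intent = parse_weather_query("tell me a joke")
```

The resulting dictionary is the kind of structured output a device could act on, for example by requesting tomorrow's forecast for the parsed location.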

Job Skills

What skills are these data science start-ups looking for? We look at their open job vacancies and see skills like:

  • Data visualization
  • Machine learning
  • Distributed systems
  • Familiarity with compliance & security standards including PCI DSS, FFIEC, GLBA, ISO 27001, HIPAA, and NIST
  • Well versed in JSP, JavaScript, JSON, XML (VXML & CCXML)
  • Experience with source code control tools
  • 2+ years of Java experience
  • Fluency in Python, Java, C++, or similar (Python strongly preferred)
  • Production experience with relational databases
  • Experience with distributed caching techniques
  • Solid foundation in data structures, algorithms and complexity analysis
  • Strong programming background in Linux
  • Passion for security, and a practical and balanced approach to security issues
  • Familiarity with AWS and MySQL
  • A knack for solving complex UI & UX problems
  • Experience building your own MEAN apps
  • Expertise in building clean, api driven code
  • Optimizing marketing strategies based on performance metrics.
  • from backend Python services to slick dashboard features in JavaScript.
  • Advise clients on strategies to meet their marketing objectives.
  • You’ve built and launched your own projects.
  • You have a Github account and you read Hacker News
  • You know what it means to build lean and iterate.
  • Node/Express
  • AWS
  • MySQL
  • AngularJS
  • Objective-C
  • Java Android SDK
  • 5 years of relevant work experience, including large systems software design and development experience, with knowledge of UNIX/Linux.
  • A Polyglot with experience applying multiple web development languages to live applications.
  • well-versed in a Python web stack
  • strong UI development experience using HTML, CSS and JavaScript/AJAX.
  • a solid foundation in computer science, you have strong competencies in data structures, algorithms, and software design.

The intro image is in the public domain and depicts Paul Graham speaking to a new batch of YC companies.
