ML & DL Bootcamp Berlin: Recommender System

How to Build a Recommender System with Machine Learning and Deep Learning

Do you know what the most requested topic in machine learning and deep learning in Berlin is?

Numerous e-commerce companies are based in Berlin, and many of them have job openings for data scientists who can build a recommender system for their platform.

Learn how to build recommender systems from our London-based trainer. Stylianos Kampakis has spent over eight years teaching, training, and coaching in Data Science, Machine Learning, and Deep Learning.

Automated recommendations are everywhere: on Netflix, YouTube, the Zalando app, and Amazon. Machine learning algorithms learn about your unique interests and show the best products or content for you as an individual.

However, do you actually know how a recommender system works? Do you know that there are several ways to build one? Do you want to learn all of them? Recommender systems are complex, but you can certainly get started in 1-2 days.

Learn hands-on: you’ll develop your own framework for evaluating and combining many different recommendation algorithms, and you’ll even build your own neural networks using TensorFlow to generate recommendations from real-world cases.

We’ll cover:

  • Building a recommendation engine
  • Evaluating recommender systems
  • Content-based filtering using item attributes
  • Neighborhood-based collaborative filtering with user-based, item-based, and KNN CF
  • Model-based methods including matrix factorization and SVD
  • Applying deep learning, AI, and artificial neural networks to recommendations
  • Real-world challenges and solutions with recommender systems
  • Case studies
  • Building hybrid, ensemble recommenders
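To give a flavour of one of the topics above, here is a minimal sketch of item-based collaborative filtering in plain Python. It is only an illustration, not the bootcamp material: the toy `ratings` data and the helper names `item_vectors`, `cosine`, and `recommend` are all made up for this example, and a real system would work on far larger, sparser data.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. All names and values are invented.
ratings = {
    "anna":  {"boots": 5, "scarf": 4, "jeans": 1},
    "ben":   {"boots": 4, "scarf": 5},
    "clara": {"jeans": 5, "tshirt": 4},
    "david": {"boots": 1, "jeans": 4, "tshirt": 5},
}

def item_vectors(ratings):
    """Invert user -> item ratings into item -> user ratings."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, score in user_ratings.items():
            items.setdefault(item, {})[user] = score
    return items

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=2):
    """Rank unseen items by a similarity-weighted average of the user's own ratings."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate, vector in items.items():
        if candidate in seen:
            continue
        weighted_sum = weight_total = 0.0
        for item, score in seen.items():
            similarity = cosine(vector, items[item])
            weighted_sum += similarity * score
            weight_total += similarity
        if weight_total > 0:
            scores[candidate] = weighted_sum / weight_total
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("ben", ratings))
```

With only four users the ranking should not be over-interpreted; the point is the shape of the computation: build item vectors, compare them, and score unseen items from the items a user has already rated.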

Who should join? 

  • Software developers interested in applying machine learning and deep learning to product or content recommendations
  • Engineers working at, or interested in working at, large e-commerce or web companies
  • Computer scientists interested in the latest recommender system theory and research

Data-Driven Product Management

Every product manager talks about data-driven product management, but what does it really mean?

Product decisions used to be based on the desires and instincts of product managers, product owners, and C-level executives. What is the main driver of these instincts: customer feedback, competitive market intelligence, or digital analytics results? The answer should be all of them. In this article, I will try to explain the crucial data sources and which metrics to consider in each of them to make the right product decisions.

Successful products belong to customers more than to product managers. But that doesn’t mean product managers should rely only on customers’ needs and demands. Sometimes even customers don’t know what they want and need. Product managers must validate customer feedback against other data sources.

 

Adapted Innovation Engine Business Model to Product Management Framework

As stated in the chart, product management should consider data from Market Research, CMI (Competitor Market Intelligence), Internal Feedback, Digital Analytics, and Customer Feedback to make data-driven decisions and create value-added information. It sounds easy, but assessing the value of the information and tying it to the product roadmap takes effort. Let’s dive into them one by one, and I will try to explain how I use this model in my product management work.

 

Check out our trainer Taner Akcok’s article on Towards Data Science and Medium, and apply to our Data-Driven Product Management Bootcamp.

Data-Driven Product Management Bootcamp for Decision Makers 

The workshop will help product managers level up their product leadership skills. Expect to work through collaborative exercises alongside other smart, creative product leaders who want to enhance their skills at both identifying the right products and features and building the support to do so.

What are the responsibilities of a Data PM?

  • Build new products by organizing teams of ML researchers and engineers who build, optimize, scale, and integrate new research results.
  • Work closely with analytics-focused data scientists to track user behavior.
  • Bring in machine learning (ML) focused data scientists and test data-driven modeling solutions.
  • Enable data engineers to build the right infrastructure and deliver robust data pipelines.

What you will learn:

  • Key analytic technologies and techniques, e.g. predictive modeling and clustering, and how these can play a role in managerial decision making
  • How to effectively manage the analytical processes and use the results of these processes as the basis for making informed, evidence-based decisions
  • How companies can use analytics as the basis for creating value

You will learn the tools and techniques to become a data-driven or “evidence-based” manager.

Which companies have already applied it?

Google, Airbnb, and Skyscanner have all applied machine learning to every step of their user experience.

If you’re excited to tackle problems like these on top product teams, join us and pre-register. Data PM is the next move for your career and your company. Some companies are starting to hire Data PMs. The trainer has worked as a Data PM at several companies, has sold two startups in SF, and is an experienced trainer who has implemented data-driven product management successfully. Companies in Silicon Valley, New York, and London have already started to train their employees; Beyond Machine will be the first to offer data-driven product management training.

If you are interested in private corporate training, please contact us individually.

What Happens When You Hire a Data Scientist Without a Data Engineer? Guest Post by Vladislav Supalov

Hey Folks,
I’m Vladislav! If you care about AI, machine learning, and data science, you should have heard of data engineering. If you haven’t, or would like to learn more – then this is *exactly* for you. Helping companies to make use of their data is a fascinating topic! I’ve spent quite a bit of time building MVP data pipelines and would like to help you avoid one of the worst mistakes you can make when starting out on a serious project.
Having solid data plumbing in place is pretty darn important if you want to work with company data without wasting time and money. The natural train of thought when people want to make use of data “the right way” usually ends at “we should hire a data scientist”.
That’s a mistake in almost every case. You need to take care of data engineering before that. Here are a few of my favourite pieces of writing on the topic:

What Happens When You Hire a Data Scientist Without a Data Engineer

This one is brief but worth a read. The most important points are the wasted time and the observed high tendency of data scientists who are not given the right tools to quit.
A complete story of getting an analytics team up and running within 500px. Samson did a lot of stuff right, which is admirable. Take note of the tech choices, Luigi, in particular, to get data into a data warehouse. A great example of a well-thought-out way to work with data. One of the major mistakes he points out: not putting enough effort into data evangelism.

Your Data Is Your Lifeblood — Set up the Analytics It Deserves

An utterly amazing interview, full of great advice. I especially love that he points out that you should take care of making both event and operational transaction data available. Only if you combine them do you have a complete picture.
A very long interview with the Head of BI at Stylight. Konstantin did an impressive job in his first year and shares a lot of insight. This is not exactly about data engineering, but about giving a company access to data and how to approach it. One of the most important takeaways for me was his advice to secure a small win for as many people as possible in the company when starting out. There is a lot of low-hanging fruit, and you get the best ROI and a lot of goodwill from making it available.
Hope you’ll get a lot of value from those articles! If you want to learn more about data engineering, data pipelines, and the stuff I do, scroll to the bottom of the last article and subscribe to Beyond Machine and Vladislav’s mailing list.

The Secret Behind One of the Biggest Online Marketplaces, OLX Group: How Do They Use Their Data and Machine Learning Algorithms?

OLX Group is a global online marketplace operating in 45 countries and is the largest online classified ads company in India, Brazil, Pakistan, Bulgaria, Poland, Portugal, and Ukraine. It was founded by Alec Oxenford and Fabrice Grinda in 2006.

A platform that connects buyers and sellers in more than 40 countries and has hundreds of millions of customers per month faces many challenges that are to some extent similar to, but also somewhat different from, those in online retail.

There are three main challenges:

Challenge 1: User experience: what users see when they navigate the platform, including the recommendations and the results of their searches.

Challenge 2: Identifying what it is that makes some advertisements much more liquid (easier to sell) than others.

Challenge 3: Liquidity prediction: predicting whether an item has been sold 15 days after its entry into the system.

 

As part of the solution for a good user navigation and browsing experience, it is useful to have a good estimate of whether a specific advertisement has already been sold, so that we don’t show it again in the recommendation or search output. This is a probabilistic time-series prediction problem. Another important aspect connected to the previous case is identifying what it is that makes some advertisements much more liquid (easier to sell) than others. For this particular case, understanding how the model makes decisions is really important, as the outcome can be provided to the sellers to help them improve the liquidity of their advertisements. For the remainder of this post we will focus on this specific liquidity prediction problem, predicting whether an item has been sold 15 days after its entry into the system, and we will use XGBoost and eli5 for modelling and explaining the predictions, respectively.
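The prediction target can be made concrete with a small sketch. This is an assumption about how such a label might be built, not OLX’s actual pipeline: the `listings` records, the `is_liquid` helper, and the reading of “sold 15 days after entry” as “sold no later than 15 days after listing” are all hypothetical.

```python
from datetime import datetime, timedelta

HORIZON = timedelta(days=15)  # the 15-day horizon comes from the text

# Hypothetical listings: (ad_id, listed_at, sold_at or None if not sold yet)
listings = [
    ("ad-1", datetime(2018, 1, 1, 12, 0), datetime(2018, 1, 3, 9, 30)),
    ("ad-2", datetime(2018, 1, 1, 18, 0), datetime(2018, 2, 2, 10, 0)),
    ("ad-3", datetime(2018, 1, 6, 13, 0), None),
]

def is_liquid(listed_at, sold_at, horizon=HORIZON):
    """True if the item was sold no later than `horizon` after it was listed."""
    return sold_at is not None and sold_at - listed_at <= horizon

# Binary target for a classifier: 1 = liquid, 0 = not liquid
labels = {ad_id: int(is_liquid(listed, sold)) for ad_id, listed, sold in listings}
print(labels)  # {'ad-1': 1, 'ad-2': 0, 'ad-3': 0}
```

In practice, advertisements listed fewer than 15 days before the data snapshot would need separate handling, because their outcome is not yet known.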

 

 

XGBoost is a well-known library for “boosting”, the process of iteratively adding models to an ensemble, where each new model targets the remaining error (pseudo-residuals). These “weak learners” are simple models and are only good at dealing with specific parts of the problem space on their own, but they can significantly reduce bias while controlling variance (giving a good model in the process) thanks to the iterative fitting approach followed in constructing this type of ensemble. The data we have available for this problem include textual data (the title and textual description of the original advertisement, plus any chat interactions between the seller and potential buyers), as well as categorical and numeric data (the category of the advertisement, the brand and model of the item, the price, the number of buyer/seller interactions for each day after the entry, etc.). The data sample we are using here is a relatively small part of the data from some countries and categories only, so in many of its properties it is not representative of the entire item collection. Nevertheless, let’s start with some basic data munging.
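Before moving on to the data, the boosting idea described above can be sketched in a few lines of plain Python. This is a simplified illustration with squared-error loss and one-split “stumps” as weak learners; it is not how XGBoost is implemented (XGBoost adds regularisation, second-order gradient information, and much more). The function names and the toy data are made up for the example.

```python
def fit_stump(xs, residuals):
    """Weak learner: the single threshold split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds=50, learning_rate=0.3):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(rounds):
        # For squared error, the pseudo-residuals are simply y - prediction
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy one-dimensional data, invented for the example
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]

one_round = boost(xs, ys, rounds=1)
many_rounds = boost(xs, ys, rounds=50)
print(mse(one_round, xs, ys), mse(many_rounds, xs, ys))
```

Each stump on its own is a poor model, but the training error shrinks as more of them are added, which is the bias reduction the paragraph above refers to.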

 

The histogram of the day that an item was sold is shown above. We can easily see that most items are sold in the first days after the respective advertisement is placed, but there are still significant sales happening a month later as well. With respect to the day an advertisement is added to the platform, we can see that there is a peak on weekends, but other days are roughly at the same level. Finally, with respect to the hour an advertisement is added to the platform, we can see in the figure below that there is a peak around lunchtime and a second peak after work hours.

One way to capture more complicated relations is to use the pairplot functionality of the seaborn library. In this case we get the combinations of scatterplots for the selected columns, while on the main diagonal we can plot something different, like the respective univariate distributions. We can see that the number of buyer interactions on the first day is a strong predictor of whether an item will be sold early or late. We can also see that the category id is a very important predictor as well, as some categories in general tend to be much more liquid than others. Now that we are done with the basic data munging, we can proceed to build a model using the XGBoost library.

 

Using a hyperparameter optimization framework, we can find the hyperparameters that work best for this data. Since we are also interested in the output confidence of the prediction itself (and not only in the class), it is typically a good idea to use a value for min_child_weight that is equal to or larger than 10 (provided that we don’t lose predictive performance), as the probabilities will tend to be better calibrated. The ranking of the features from the XGBoost model is shown in the figure above. Feature ranking from tree ensembles can be biased (favoring, for example, continuous features or categorical features with many levels over binary features or categorical features with few levels), and if features are highly correlated the effect can be split between them in a non-uniform way. Even so, this is already a good indication for many purposes. Now we select one specific instance at prediction time. Using eli5 we get an explanation of how this instance was handled internally by the model, together with the features that were the most important positive and negative influences for this specific sample.

 

As we can see, the sample was classified as liquid, but there was still some downward pull from the text properties (title length, title words, etc.), which we can use to guide the seller in improving the advertisement.
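To show the kind of per-feature explanation discussed above, here is a minimal sketch of decision-path contributions on a single hand-written tree. As far as we understand, eli5 documents a similar idea for tree ensembles (a feature’s contribution is how much the score changes from parent to child along the decision path), but this is not eli5’s code or API. The tree, its scores, and the feature names `first_day_chats` and `title_length` are hypothetical.

```python
# Hypothetical tree: every node stores the average score of the training
# rows that reach it; internal nodes also store a split.
tree = {
    "value": 0.40,
    "feature": "first_day_chats", "threshold": 3,
    "left": {"value": 0.20},
    "right": {
        "value": 0.70,
        "feature": "title_length", "threshold": 20,
        "left": {"value": 0.55},
        "right": {"value": 0.85},
    },
}

def explain(node, sample):
    """Return (bias, contributions); bias + sum(contributions) is the leaf score."""
    bias = node["value"]
    contributions = {}
    while "feature" in node:
        feature = node["feature"]
        child = node["left"] if sample[feature] <= node["threshold"] else node["right"]
        # Credit the change in score to the feature used for this split
        change = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + change
        node = child
    return bias, contributions

sample = {"first_day_chats": 5, "title_length": 12}
bias, contributions = explain(tree, sample)
print("bias:", bias)
for feature, change in contributions.items():
    print(f"{feature}: {change:+.2f}")
print("score:", round(bias + sum(contributions.values()), 2))
```

In this toy case the many first-day chats push the score up while the short title pulls it down, mirroring the pattern described for the real sample.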

How to land the sexiest job, Data Scientist, in 2018?

We did some research online and found the major obstacles and challenges people face when working as a data scientist or switching careers to become one:

#Don’t know where to start or where to learn?

#No network to find a job or company you like?

#No past experience working as a data scientist, even though you know how to write the algorithms?

Here are the 8 essential tips to follow

1. Set your goal first

Are you looking for a new, challenging job? Do you simply enjoy algorithms or mathematics? Do you want to start your own startup? Do you have a specific problem you want to solve? Or do you just want a better job with more money?

2. Find someone who has followed the path you are interested in, and ask him/her to be your mentor

It is like running a startup: it takes a lot of energy, time, dedication, motivation, discipline, and sometimes even a fair amount of money. You can take the best deep learning courses from famous professors and experts all over the world; however, when it comes to soft skills, you need a mentor to help you. If he/she also has an extensive network, even better.

3. Use online education and make a plan to learn and study

Needless to say, there are Andrew Ng’s deep learning courses on Coursera, partnered with the NVIDIA Deep Learning Institute.

Andrew Ng is one of the best instructors and teachers I have ever heard and seen; he is perfectly articulate, patient, and full of passion for teaching. I personally took a couple of his online classes at Stanford. (The courses were simple enough for anyone who is determined to start!)

Btw, I sucked at Linear Algebra during high school 🙁

If you like more entertaining ones, the YouTube star Siraj Raval also has a YouTube channel and Udemy courses online to follow and learn from; his goal is to use machine learning to build anything you can dream of. After enough time studying and learning online, it is always great to apply what you have learned to the topic you are especially interested in.

Don’t forget the traditional way, reading a book: the bible of deep learning, written by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

Certainly, if you like a more personal touch, we offer a Deep Learning Bootcamp in January in Berlin covering NLP, time series forecasting, and computer vision. Our instructors are from IBM and the famous Max Planck Institute. Romeo Kienzler also has a YouTube channel and is one of the most successful deep learning instructors in Europe.

4. Join online and offline communities to help you learn, engage, and receive feedback to improve

Online: GitHub, Codecademy, Kaggle, KDnuggets

Offline: Meetup, Eventbrite

5. Choose a Tool / Language and stick to it

This is one of the most common questions for beginners. We may write another article to go deeper into it in the future. For now, choose the language you are most familiar with or the one that is simplest for you. If you are completely new, choose based on data handling capabilities, the pace of tool development, career opportunities, and deep learning support. From my personal experience: I was consulting for a machine learning startup with one Python scientist and one R scientist, and every time I had a problem with data, I asked the Python scientist to write me some quick code to fix it, while R could not really handle it in such an easy way. Python is the love of my life for now.

6. Focus on real cases and applications, not just learning theory

Think about a scenario you want to work on: music recommendation? The ending of Game of Thrones? Parking problems? Predicting the stock market? (Check out our Deep Learning Bootcamp on Eventbrite.)

7. Don’t forget that machine learning and deep learning come down to your mathematical ability

Warning: do not just take deep learning and NLP courses and forget that math is the real foundation. Take some math courses as well, or at least linear algebra.

8. Finally, you can start networking, but always remember: play smart

Don’t waste too much energy and time on it, especially since a lot of networking events come with free alcohol… Check the topic before you attend, and also the attendees: are they your target audience or not? Or sign up for a recruiting event, where there will be tons of recruiters and HR managers from companies. (We are launching our first Recruiting Day, focused on data team recruitment, in March 2018.)

P.S. M.I.E is currently rebranding to Beyond Machine. Our mission is to connect and boost the AI ecosystem through connecting and training. We will launch a series of events in 2018, including the Deep Learning Bootcamp, Data Analytics Workshop, Recruitment Day, Self-Driving Workshop, and Machine Learning Week. Furthermore, we will also launch outside of Berlin and bring more corporate and partnership exchange in and outside of Germany. If you are interested in any kind of sponsorship or partnership, just drop us a line.

 

Build it, Train it, Test it!

 

7 reasons you should join us at the world’s first OPEN AIR AI summit in Berlin

 

WHY SHOULD YOU COME?

1. The statistic shows the size of the global market for artificial intelligence for enterprise applications from 2016 to 2025. In 2016, the enterprise AI market was estimated to be worth around 360 million U.S. dollars worldwide.

2. We are the world’s FIRST OPEN AIR MACHINE INTELLIGENCE SUMMIT. Our last two evening summits were held in rooftop bars in Berlin and Paris, and this summer’s summit will be held in a Biergarten. We know boring conference rooms kill creativity.

AI market estimation, 2016 to 2025

3. We have awesome speakers confirmed to talk on stage.

  • Reiner Kraft, VP@Zalando
  • Alex Housley, CEO@Seldon
  • Romeo Kienzler, Chief Data Scientist@IBM
  • Claudio Weck, Head of Data Science@MHPLab: A Porsche Company
  • Johannes Schaback, Co-founder@Visual Meta
  • Ulrike Franke, Drone & warfare scholar@Oxford

4. We have mentoring sessions hosted by Techstar, Techstar SAP, Rockstar, and other exciting accelerators/incubators.

5. We also offer workshop sessions hosted by IBM Watson Chief Data Scientist Romeo Kienzler, who is also an instructor at Coursera.

The description of his workshop at M.I.E Summit Berlin.

6. We are INTERNATIONAL: we are the first media & community from Berlin expanding outside of Berlin, and we have had speakers from London, Paris, NYC, Amsterdam, Zurich…

For more info, please visit our website.

7. Early bird tickets are going to be out soon! Hurry up before the end of May.

MIE Summit

The founding story behind Beyond Machine (rebranded from M.I.E)

Beyond Machine (rebranded from M.I.E) was spawned from Lele and Irene’s constant frustration during the founding of their AI startups. At the end of 2015, they both left their jobs at rising mobile ad tech and product companies. Lele first started SoCrowd and pivoted to Deckard after 3 months. Irene wanted to tackle the challenges of visual recognition.

It quickly became apparent that there was a need for a more developed community and an outlet for media around Machine Intelligence. After running a fruitful and inspiring Evening Summit, they decided to take Beyond Machine to the next level, founding a media company focused on training, re-education, and networking in the field of AI and innovative technology. Irene decided to leave FindEssence, the first company she co-founded, and push forward the growth of Beyond Machine.

 

Beyond Machine’s mission: to connect people in the AI industry globally, to bring profound and engaging content, and to start a conversation about job substitution issues.

The blurb of M.I.E Summit 2017:

M.I.E. Summit Berlin 2017 is the World’s first open-space machine intelligence summit, which will be held on the 20th of June 2017.

This event will give you the opportunity to learn, discuss, and network with your peers in the MI field. Set in one of Berlin’s most vibrant and artistic locations, it lets you break free from traditional conference rooms and share a drink in a typical Berliner Biergarten.

The M.I.E Summit Berlin 2017 will provide you with two in-depth event tracks (keynotes, workshops, and panels) as well as over 25 leading speakers and unparalleled networking opportunities.

 

The agenda is announced on the website.