Why Software Developers Should Know More About Machine Learning

Machine learning is about decision making and, like any learning, it is about understanding things better. Put simply, in a machine learning project we try to extract models from data. We hope that those models are both understandable and computational.

But the purpose varies, from analytics through predictive modeling to control, and no single method fits all of them. Rule bases, trees, and similar models are understandable, but to make them computational you need to apply fuzzy variants, and to make them more accurate and robust you should apply regularization techniques (numerically optimizing fuzzy sets, membership functions, ...).
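To make the step from a crisp rule to its fuzzy variant concrete, here is a minimal sketch in Python. The temperature and fan example, the function names, and the breakpoints 20 and 30 are assumptions chosen for illustration only. The point is that the fuzzy rule yields a continuous output whose breakpoints are numeric parameters that could later be optimized.

```python
def crisp_rule(temperature):
    # Classic rule: IF temperature > 25 THEN fan = 100 ELSE fan = 0.
    # Readable, but the output jumps at the threshold.
    return 100.0 if temperature > 25.0 else 0.0


def hot_membership(temperature, start=20.0, full=30.0):
    # Degree (0..1) to which the temperature counts as "hot".
    # start and full are hypothetical breakpoints; in a real project
    # they are the parameters one would tune numerically.
    degree = (temperature - start) / (full - start)
    return min(1.0, max(0.0, degree))


def fuzzy_rule(temperature):
    # Two rules fire to a degree instead of all-or-nothing:
    #   IF hot     THEN fan = 100
    #   IF not hot THEN fan = 0
    # The result is the membership-weighted average of both conclusions.
    hot = hot_membership(temperature)
    return hot * 100.0 + (1.0 - hot) * 0.0


if __name__ == "__main__":
    for t in (18.0, 24.9, 25.1, 32.0):
        print(t, crisp_rule(t), fuzzy_rule(t))
```

The rule stays as readable as before, but its output is now a smooth function of the input, which is what makes it usable in numerical optimization.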

Neural nets are more or less black boxes, but they are computational.

To calculate control functions for given goal parameters you need to apply an inverse problem view - as you do when you calibrate and recalibrate models.
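The inverse problem view can be sketched as follows: a forward model maps a control input to an output, and we search for the input that produces a desired output. The forward model below is a made-up formula standing in for a learned model, and bisection is just one simple way to invert it, assuming the model is continuous and increasing on the search interval.

```python
def forward_model(control):
    # Hypothetical stand-in for a learned model:
    # maps a control input to a process output.
    return 3.0 * control + 0.5 * control ** 2


def invert(model, target, low, high, tolerance=1e-9):
    # Find a control value in [low, high] whose model output matches
    # the target. Assumes the model is continuous and increasing there.
    if not (model(low) <= target <= model(high)):
        raise ValueError("target is not reachable on the given interval")
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if model(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Goal parameter: we want an output of 8.0 and ask for the control input.
control = invert(forward_model, target=8.0, low=0.0, high=10.0)

if __name__ == "__main__":
    print(control, forward_model(control))
```

Calibration has the same shape: there the unknown is a model parameter rather than a control input, and the target is the observed data.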

What is the difference between doing a machine learning project and a development project?

In both you usually deal with data, and data analysis as an exploratory approach should be the beginning of anything. But machine learning experts have a different relationship with code than developers do; think of the programming paradigms derived from the best problem decomposition: data-, function-, or object-oriented. When you create models automatically, you do not think so much about such things - they are inherent in the structure of the models.

But machines produce ever more monstrous amounts of data, and the best way to deal with all that information is with machines.

So people who do large-scale machine learning (as at Google) think much like software engineers who need to design large-scale, industrial-type systems.

What about software analytics? 

Yes, there is another aspect. Modern analytic and model-based approaches in software engineering aim at the automatic generation of software from domain models, ... For this task, program code is data.
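As a small illustration of treating program code as data, the sketch below parses a piece of Python source with the standard `ast` module and turns it into a few numeric features. The sample function and the choice of metrics are assumptions for the example, not a recommended metric set.

```python
import ast

# Sample source to analyze; any Python source text would do.
SOURCE = '''
def classify(x):
    if x < 0:
        return "negative"
    for _ in range(3):
        if x > 10:
            return "large"
    return "small"
'''


def code_metrics(source):
    # Parse the source into a syntax tree and count node types.
    # The resulting dictionary is ordinary tabular data that any
    # learning method can consume.
    nodes = list(ast.walk(ast.parse(source)))
    return {
        "functions": sum(isinstance(n, ast.FunctionDef) for n in nodes),
        "branches": sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in nodes),
        "returns": sum(isinstance(n, ast.Return) for n in nodes),
    }


metrics = code_metrics(SOURCE)

if __name__ == "__main__":
    print(metrics)
```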

We ran some experiments on defect prediction in large industrial software systems (static code analysis) and found that fuzzy decision tree methods suit this purpose quite well.
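To give a rough idea of what a fuzzy decision tree does with such data, here is a hand-built toy tree in Python. It is not the method or the model from our experiments: the two features, the thresholds, the overlap widths, and the leaf values are all invented for illustration. Unlike a crisp tree, a sample follows both branches of each split to a degree, and the leaf values are blended accordingly.

```python
def soft_greater(value, threshold, width):
    # Degree (0..1) to which value exceeds threshold.
    # Within threshold +/- width the split is gradual instead of hard.
    degree = (value - (threshold - width)) / (2.0 * width)
    return min(1.0, max(0.0, degree))


def defect_risk(branches, lines):
    # Hypothetical tree: the root splits on branch count, and the
    # "many branches" child splits on module size. All numbers below
    # are invented for illustration.
    many_branches = soft_greater(branches, threshold=10.0, width=5.0)
    long_module = soft_greater(lines, threshold=300.0, width=100.0)

    leaf_simple = 0.1          # few branches
    leaf_complex_short = 0.4   # many branches, short module
    leaf_complex_long = 0.8    # many branches, long module

    complex_side = ((1.0 - long_module) * leaf_complex_short
                    + long_module * leaf_complex_long)
    return (1.0 - many_branches) * leaf_simple + many_branches * complex_side


if __name__ == "__main__":
    for sample in ((2, 50), (10, 300), (25, 600)):
        print(sample, defect_risk(*sample))
```

In a real setting the tree structure and its parameters would be learned from labeled defect data rather than written by hand.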

The experiments were carried out with our mlf.

So software developers may enrich their development methodology with machine learning and apply machine learning in the software quality assurance process - with a two-sided positive effect.