Our talks at PyCon DE/PyData Berlin
From 11th to 13th April, PyCon DE and PyData Berlin took place, for the first time as a hybrid conference. Besides attending presentations and tutorials, discussing with other experts and enjoying the sights and culinary specialities of Berlin, we also had the chance to showcase some of our work and experiences in two talks.
Predictive Maintenance & Anomaly Detection for Wind Energy
Wind energy is one of the most promising options for decarbonizing the electricity grid. As wind turbines are usually located in remote areas and are often expensive to access, improving remote maintenance and diagnosis is crucial to the future expansion of wind energy. Extensive efforts are therefore made to facilitate the efficient off-site supervision of wind turbines using diverse data recorded from hundreds of sensors that monitor the current state of each unit.
In the current iteration of these efforts, predictive maintenance techniques are used to model the normal behavior of multiple turbine components, automatically spot significant deviations from regular operation and notify diagnosticians. Current development aims to increase the level of automation by incorporating diagnostic data from historical defects in order to accelerate diagnosis and actively learn from previous experience. To achieve this, challenges such as a high degree of heterogeneity, the rarity of defect events and the high diversity of defect types have to be overcome.
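To make the general pattern concrete, here is a minimal, self-contained sketch of such a normal behavior model: a regressor is trained on sensor data from healthy operation only, and new samples whose residual exceeds a threshold are flagged as anomalous. The synthetic data, feature choices and threshold are illustrative assumptions, not details of EnBW's actual system.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for SCADA sensor data from healthy operation:
# inputs could be wind speed, ambient temperature and rotor speed,
# the target a monitored component temperature (all hypothetical).
rng = np.random.default_rng(0)
X_healthy = rng.normal(size=(5000, 3))
y_healthy = X_healthy @ np.array([2.0, 0.5, 1.0]) + rng.normal(scale=0.1, size=5000)

# 1. Model the component's normal behavior from healthy periods only.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_healthy, y_healthy)

# 2. Derive an alarm threshold from the residuals on healthy data,
#    e.g. mean plus four standard deviations (an assumed choice).
residuals = y_healthy - model.predict(X_healthy)
threshold = residuals.mean() + 4 * residuals.std()

def is_anomalous(X_new, y_new):
    """Flag samples whose deviation from predicted normal behavior exceeds the threshold."""
    return np.abs(y_new - model.predict(X_new)) > threshold
```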
In this talk, I provide an overview of how EnBW employs machine learning techniques to detect anomalous behavior using its maintenance software. I also discuss the challenges that arise and upcoming solutions to these issues, which boost both the economic and ecological efficiency of wind energy as we work towards a carbon-free future in the power sector.
Have a look at the slides
A Data Scientist’s Guide to Code Reviews
The standard code review process known from traditional software engineering also applies to data tasks when these largely follow traditional software engineering practices. Examples of such tasks are data extraction or transformation pipelines and machine learning (ML) services which run in production(-like) systems.
Yet, a significant amount of data science work is experimental, e.g. analysing data, preparing data for use with ML algorithms, or training and evaluating ML models. In our experience, code reviews are often skipped during these experimental tasks, although they are still highly important for detecting issues or errors early. One reason for skipping reviews is that the focus of this work is not to produce production-grade code but rather to try out and verify a certain concept, which can be put into production later on, if successful.
For code reviews to remain effective, they need to be adapted to the different requirements of data science work. Key to these adjusted reviews is a lesser focus on code quality and a stronger focus on the technical correctness and logic of the underlying concept. Code reviews thus evolve into a form of peer review, as known from the process of reviewing academic papers.
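As a hypothetical illustration of this shift in focus, consider the following experimental snippet. A style-focused review might only comment on naming, while a concept-focused review catches the actual methodological error: the scaler is fit on the full dataset before splitting, leaking test-set statistics into training. The dataset and model choices here are illustrative assumptions, not examples from the talk.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Concept-level review finding: fitting the scaler on the *full* data
# leaks test-set statistics into the training features.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # <-- data leakage
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, random_state=0)

# Fixed version the review should suggest: split first, then fit the
# scaler on the training split only.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
model = LogisticRegression(max_iter=1000).fit(scaler.transform(X_train), y_train)
print(model.score(scaler.transform(X_test), y_test))
```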
Have a look at the slides
Tobias Hoinka – Data Scientist
Alexandra Wörner – Data Scientist