Author

Contact

michal.klempa@gmail.com
michalklempa.com

Public profiles

github.com/michalklempa
hub.docker.com/u/michalklempa
linkedin.com/in/michal-klempa
twitter.com/KlempaMichal
stackoverflow.com/users/3944551/michal-klempa

Projects

Docker

Maintainer of my own open-source NiFi Registry Docker image with 100K+ downloads.

NiFi

Code contributor to Apache NiFi.

My projects using NiFi:

  • ETL pipelines from various sources to Kafka with Avro encoding
  • ASN.1+BER/DER/PER parsing and transforming pipeline – including development of custom Processors for NiFi
  • SNMP data collection, transformation and storage
  • SMS notification system based on NiFi, MySQL and Gammu
  • Unity3D model conversion automation, using S3 buckets as source and destination of models

Developed a rule-based customer notification engine built on Apache Flink, with Apache Avro used for state serialization. The engine runs in a Docker Swarm cluster and performs multiple stream joins before rule evaluation.

Knowledge and experience

Stream Processing

Apache Kafka and Confluent Platform: installation and configuration. Securing Kafka with Kerberos.
Kafka integration with NiFi, Flink and Spark.
Stream processing with Flink and Spark. Protocol Buffers and Avro as in-transit formats.
Apache NiFi: ETL pipelines with NiFi, Kafka, Flink, Schema Registry and Elasticsearch.

DevOps

Deployment: Docker Swarm, Kubernetes, Ansible, Vagrant, Terraform.
CI/CD: Jenkins Pipelines, GitLab CI, Artifactory.
Cloud: AWS (ECS, EMR, S3, VPC, Route 53, EC2), Google Cloud (GCE, GKE), Azure (HDInsight, AKS), DigitalOcean.

BigData

Data pipelines: Hadoop, Spark, NiFi, Hive, Pig, Sqoop.
Data stewardship using Zeppelin with Scala and Spark.
Hadoop administration: Apache Ambari and Cloudera Manager, Hadoop installation, tuning HDFS and YARN, securing Hadoop with Kerberos and LDAP/AD integration.
Certified for Cloudera and Hortonworks Hadoop distributions.

Background

Experience programming a custom OS kernel (C, assembler) running on a simulated CPU (MIPS R3000).
Java server-side technologies: Hibernate, MyBatis, Spring. RDBMSs: MySQL, PostgreSQL.
Java frontend frameworks: Vaadin, Spring MVC, Swing.

Certificates