Senior Big Data Engineer

Responsibilities:

• Design, develop, and implement big data engineering projects in the Hadoop ecosystem.

• Engineer high-quality solutions with Cloudera, MapR, or HDP for both batch and streaming data, working with a sense of urgency.

• Develop applications and custom integration solutions using Spark Streaming and Hive.

• Understand specifications, then plan, design, and develop software solutions that adhere to process, either individually or as part of a project team.

• Work in state-of-the-art programming languages and use object-oriented approaches in designing, coding, testing, and debugging programs.

• Work with support teams to resolve operational and performance issues.

• Select and integrate any Big Data tools and frameworks required to provide requested capabilities.

• Integrate data from multiple data sources and implement ETL processes using Apache NiFi.

• Monitor performance and advise on any necessary infrastructure changes.

• Manage the Hadoop cluster and all included services, such as Hive, HBase, MapReduce, and Sqoop.

• Clean data per business requirements using streaming APIs or user-defined functions.

• Build distributed, reliable, and scalable data pipelines to ingest and process data in real time, and define Hadoop job flows.

• Manage Hadoop jobs using a scheduler.

• Apply different HDFS file formats and structures, such as Parquet and Avro, to speed up analytics.

• Work with various Hadoop ecosystem tools such as Hive, Pig, HBase, and Spark.

• Review and manage Hadoop log files.

• Assess the quality of datasets for a Hadoop data lake.

• Fine-tune Hadoop applications for high performance and throughput.

• Troubleshoot and debug any Hadoop ecosystem runtime issues.

• Take part in a POC effort to help build new Hadoop clusters.

Education:

Bachelor’s degree or higher in Computer Science, Information Systems, or a related engineering discipline

General Knowledge, Skills & Abilities:

• Be a capable, detail-oriented data engineer.

• Systematic and organizational skills are important.

• Willing to commit to completing deliverables on time.

Preferred Qualifications:

• Must have experience with Spark, Hive, Scala, or PySpark.

• Experience in one of the following is preferred: NiFi, Kafka, or another streaming technology.

• 3+ years of experience in data engineering, building ETL pipelines using Java, Python, or Scala.

• Should be good at Pig and Hive scripting.

• Solid understanding of HDFS is important.

• Work experience within a Data Warehousing/Business Intelligence/Data Analytics group, and hands-on experience with Hadoop.

• Ability to create tables and views in Hive or another relevant scripting language.

• Experience with Agile development methodologies.

• Experience with NoSQL databases, such as HBase, Cassandra, and MongoDB.

Experience architecting solutions using any of the following:

• Java, Python, or Scala programming languages

• NiFi, Kafka topics, or any other streaming technologies

• Parquet/Avro/ORC/XML/JSON/CSV/TXT formats

Location: Jacksonville, FL

Skills:

Python, Java, Scala, MongoDB, Spark, Kafka