Things to Come From the Cloudera/Hortonworks Merger

Now that the two Hadoop distribution giants have merged, it is time to call out what will happen to their overlapping software offerings. The following are my predictions:

Ambari is out – replaced by Cloudera Manager.
This is a no-brainer for anyone who has used both tools. People can rant and rave about open source and freedom all they want, but Cloudera Manager is light-years ahead of Ambari in terms of functionality and features. I mean, Ambari can only deploy a single cluster, while CM can deploy and manage multiple clusters. And the two features I personally use the most in my job as a consultant are nowhere to be found in Ambari: the Host/Role layout and the non-default Configuration view.

Tez is out – replaced by Spark.
Cloudera has already declared that Spark has replaced MapReduce. There is little reason for Tez to remain as a Hive execution engine when Spark does the same things and can also be used for general computation outside of Hive.

Hive LLAP is out – replaced by Impala.
Similar to Tez, there is no reason to keep an interactive query layer for Hive around when Impala was designed to do just that. Remember: Hive is for batch and Impala is for exploration.

What do you think? Leave your thoughts in the comments.

Failed Disk Replacement with Navigator Encrypt

Hardware fails, especially hard disks. Your Hadoop cluster will be operating with less capacity until that failed disk is replaced, and using full disk encryption adds to the replacement trouble. Here is how to do it without bringing down the entire machine (assuming, of course, that your disk is hot-swappable).

Assumptions:

  • Cloudera Hadoop and/or Cloudera Kafka environment.
  • Cloudera Manager is in use.
  • Cloudera Navigator Encrypt is in use.
  • Physical hardware that allows a data disk to be hot-swapped without powering down the entire machine. Otherwise, the machine will be down anyway and you can pretty much skip steps 2 and 4.
  • We are replacing a data disk and not an OS disk.

Steps:

The following are the steps to replace a failed disk that is encrypted by Cloudera Navigator Encrypt. If any of the settings below are missing in your Cloudera Manager (CM), consider upgrading CM to a newer version.

  1. Determine the failed disk. The example used here is a disk that is mounted at /data/0.
  2. Configure data directories to remove the disk you are swapping out:
    1. HDFS
      1. Go to the HDFS service.
      2. Click the Instances tab.
      3. Click the affected DataNode.
      4. Click the Configuration tab.
      5. Select Category > Main.
      6. Change the value of the DataNode Data Directory property to remove the directories that are mount points for the disk you are removing.

        Warning: Change the value of this property only for the specific DataNode instance where you are planning to hot swap the disk. Do not edit the role group value for this property. Doing so will cause data loss.

      7. Click Save Changes to commit the changes.
      8. Refresh the affected DataNode. Select Actions > Refresh DataNode configuration.
    2. YARN
      1. Go to the YARN service.
      2. Click the Instances tab.
      3. Click the affected NodeManager.
      4. Click the Configuration tab.
      5. Select Category > Main.
      6. Change the value of the NodeManager Local Directories property to remove the directories that are mount points for the disk you are removing.

        Warning: Change the value of this property only for the specific NodeManager instance where you are planning to hot swap the disk. Do not edit the role group value for this property. Doing so will cause data loss.

      7. Change the value of the NodeManager Container Log Directories property to remove the directories that are mount points for the disk you are removing.

        Warning: Change the value of this property only for the specific NodeManager instance where you are planning to hot swap the disk. Do not edit the role group value for this property. Doing so will cause data loss.

      8. Click Save Changes to commit the changes.
      9. Refresh the affected NodeManager. Select Actions > Refresh NodeManager.
    3. Impala
      1. Go to the Impala service.
      2. Click the Instances tab.
      3. Click the affected Impala Daemon.
      4. Click the Configuration tab.
      5. Select Category > Main.
      6. Change the value of the Impala Daemon Scratch Directories property to remove the directories that are mount points for the disk you are removing.

        Warning: Change the value of this property only for the specific Impala Daemon instance where you are planning to hot swap the disk. Do not edit the role group value for this property. Doing so will cause data loss.

      7. Click Save Changes to commit the changes.
      8. Refresh the affected Impala Daemon. Select Actions > Refresh the Impala Daemon.
    4. Kafka
      1. Go to the Kafka service.
      2. Click the Instances tab.
      3. Click the affected Kafka Broker.
      4. Click the Configuration tab.
      5. Select Category > Main.
      6. Change the value of the Log Directories property to remove the directories that are mount points for the disk you are removing.

        Warning: Change the value of this property only for the specific Kafka Broker instance where you are planning to hot swap the disk. Do not edit the role group value for this property. Doing so will cause data loss.

      7. Click Save Changes to commit the changes.
      8. Refresh the affected Kafka Broker. Select Actions > Refresh Kafka Broker.
  3. Remove the old disk and add the replacement disk. (A condensed shell walk-through of these sub-steps appears after the full step list.)
    1. List out the disks in the system, taking note of the name of the failed disk. (lsblk; lsscsi)
    2. Determine the failed disk. The example used here is /data/0, which resolves to the encrypted mount point /navencrypt/0. (readlink -f /data/0)
    3. Determine the Navigator Encrypt DISKID of the failed source device. (grep /navencrypt/0 /etc/navencrypt/ztab)
    4. Clean up Navigator Encrypt entries. (navencrypt-prepare --undo ${DISKID} || navencrypt-prepare --undo-force ${DISKID})
      1. You may also need to run: (cryptsetup luksClose /dev/mapper/0; dd if=/dev/zero of=${DISK}1 ibs=1M count=1)
    5. Remove failed disk.
    6. Add replacement disk.
    7. Perform any HBA configuration (e.g. Dell PERC/HP SmartArray RAID0 machinations).
    8. Determine the name of the new disk. The example used here is /dev/sdo. (lsblk; lsscsi)
    9. Partition the replacement disk. (parted -s ${DISK} mklabel gpt mkpart primary xfs 1 100%)
    10. Have Navigator Encrypt configure the disk for encryption and write out a new filesystem. (navencrypt-prepare -t xfs -o noatime --use-uuid ${DISK}1 /navencrypt/0)
    11. Fix the symlink target directory installed by navencrypt-move. (mkdir -p $(readlink -f /data/0))
  4. Configure data directories to restore the disk you have swapped in:
    1. HDFS
      1. Change the value of the DataNode Data Directory property to add back the directory that is the mount point for the disk you added.
      2. Click Save Changes to commit the changes.
      3. Refresh the affected DataNode. Select Actions > Refresh DataNode configuration.
      4. Run the HDFS fsck utility to validate the health of HDFS.
    2. YARN
      1. Change the value of the NodeManager Local Directories and NodeManager Container Log Directories properties to add back the directory that is the mount point for the disk you added.
      2. Click Save Changes to commit the changes.
      3. Refresh the affected NodeManager. Select Actions > Refresh NodeManager.
    3. Impala
      1. Change the value of the Impala Daemon Scratch Directories property to add back the directory that is the mount point for the disk you added.
      2. Click Save Changes to commit the changes.
      3. Refresh the affected Impala Daemon. Select Actions > Refresh the Impala Daemon.
    4. Kafka
      1. Change the value of the Log Directories property to add back the directory that is the mount point for the disk you added.
      2. Click Save Changes to commit the changes.
      3. Refresh the affected Kafka Broker. Select Actions > Refresh Kafka Broker.
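
For reference, here is step 3 condensed into a single shell session, built from the commands above. It is a sketch: the failed mount is /data/0, the encrypted mount point is /navencrypt/0, the replacement disk shows up as /dev/sdo, and the DISKID placeholder must be filled in from your own ztab output. Likewise, the CM property edits in steps 2 and 4 simply remove and re-add the affected mount from a comma-separated list (a hypothetical DataNode Data Directory of /data/0/dfs/dn,/data/1/dfs/dn would temporarily become /data/1/dfs/dn).

    # Identify the failed device and its Navigator Encrypt mapping.
    lsblk
    lsscsi
    readlink -f /data/0                      # the symlink resolves into /navencrypt/0
    grep /navencrypt/0 /etc/navencrypt/ztab  # note the DISKID of the source device

    # Clean up the old Navigator Encrypt entries.
    DISKID='<value from ztab>'               # placeholder - copy from the grep above
    navencrypt-prepare --undo ${DISKID} || navencrypt-prepare --undo-force ${DISKID}
    # If the device mapping is still held open, you may additionally need:
    #   cryptsetup luksClose /dev/mapper/0
    #   dd if=/dev/zero of=${DISK}1 ibs=1M count=1   # ${DISK} = the old device

    # After physically swapping the disk and doing any HBA configuration,
    # partition and re-encrypt the replacement.
    DISK=/dev/sdo
    parted -s ${DISK} mklabel gpt mkpart primary xfs 1 100%
    navencrypt-prepare -t xfs -o noatime --use-uuid ${DISK}1 /navencrypt/0

    # Recreate the directory that the /data/0 symlink points to.
    mkdir -p $(readlink -f /data/0)

    # Once the CM configuration is restored (step 4), validate HDFS health.
    hdfs fsck /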

Reference Links:

https://www.cloudera.com/documentation/enterprise/latest/topics/admin_dn_swap.html

https://www.cloudera.com/documentation/enterprise/latest/topics/navigator_encrypt_prepare.html#concept_device_uuids

strict_variables and the RazorsEdge Puppet Modules

Over the past month I have been adding much-needed support for running Puppet with strict_variables = true to all of the RazorsEdge Puppet modules. Thanks to coreone, I finally had a solution that did not require tearing out the legacy global variable support. As much as I think that continuing to carry global variable support has become painful, I am still committed to keeping it around.
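
If you want to test your own manifests with the same strictness, strict_variables is a single setting. A minimal sketch (the puppet.conf path varies by Puppet version and packaging):

    # /etc/puppet/puppet.conf (or /etc/puppetlabs/puppet/puppet.conf on Puppet 4+)
    [main]
      strict_variables = true

It can also be enabled for a one-off run with puppet apply --strict_variables site.pp.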

I also managed to get the RSpec testing Ruby gem dependencies configured such that things can still be tested on Ruby 1.8.7, 1.9.3, and 2.x as well as Puppet 2.7, 3.x, and 4.x. Travis-CI also tests Ruby 2.4 and Puppet 5.x for all of the modules. As of now, only two modules are not passing the Puppet 5 RSpec tests, and I hope to get those sorted soon.
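
For anyone curious how such a version matrix is typically wired up: the trick is to let the CI environment pick the Puppet gem. A hypothetical Gemfile fragment (the actual files in these modules may differ):

    # Gemfile (sketch): let CI select the Puppet version under test.
    source 'https://rubygems.org'
    gem 'puppet', ENV['PUPPET_GEM_VERSION'] || '>= 2.7'
    gem 'rspec-puppet'
    gem 'puppetlabs_spec_helper'

Each Travis-CI job then exports a different PUPPET_GEM_VERSION (e.g. '~> 3.0', '~> 4.0') to build out the matrix.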

https://forge.puppetlabs.com/razorsedge/certmaster
https://forge.puppetlabs.com/razorsedge/cloudera
https://forge.puppetlabs.com/razorsedge/func
https://forge.puppetlabs.com/razorsedge/hp_mcp
https://forge.puppetlabs.com/razorsedge/hp_spp
https://forge.puppetlabs.com/razorsedge/lsb
https://forge.puppetlabs.com/razorsedge/network
https://forge.puppetlabs.com/razorsedge/openlldp
https://forge.puppetlabs.com/razorsedge/openvmtools
https://forge.puppetlabs.com/razorsedge/razorsedge
https://forge.puppetlabs.com/razorsedge/snmp
https://forge.puppetlabs.com/razorsedge/tor
https://forge.puppetlabs.com/razorsedge/vmwaretools

Let me know if you have any feedback!

Hue Load Balancer TLS Errors

This is a reblog from the Clairvoyant blog.

If you are configuring the Hue load balancer with Apache httpd 2.4 and TLS certificates, you may end up with errors. The httpd proxy checks the certificates of the target systems, and if they do not pass some basic consistency checks, the proxied connection fails.
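
The checks in question are mod_ssl's proxy peer checks. The right fix is usually to issue backend certificates that actually pass them; the linked post has the details. For orientation only, here is a sketch of the httpd 2.4 knobs involved (relaxing them trades away trust in the backend certificates):

    # httpd 2.4 mod_ssl proxy peer checks (sketch, not a recommendation)
    SSLProxyEngine on
    SSLProxyVerify none
    SSLProxyCheckPeerCN off
    SSLProxyCheckPeerName off
    SSLProxyCheckPeerExpire off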

Read more of my post on the Clairvoyant blog.

puppet cloudera module 3.0.0

This is a major release of my Puppet module to deploy Cloudera Manager. The major change is that razorsedge/cloudera now supports the latest releases of its dependent modules. razorsedge/cloudera had been lagging behind due to the need to support Puppet Enterprise 3.0.1 installations, and only recently did those installations finally upgrade.

Notable changes are:

https://forge.puppetlabs.com/razorsedge/cloudera
https://github.com/razorsedge/puppet-cloudera
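
Picking up the new release is a one-liner if you manage modules with the puppet module tool (adjust for your own workflow, e.g. r10k or librarian-puppet):

    puppet module upgrade razorsedge-cloudera --version 3.0.0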

Let me know if you have any feedback!

puppet cloudera module 2.0.2

This is a minor bugfix release of my Puppet module to deploy Cloudera Manager. When I released the module, I had assumed that the testing I did for the C5 beta2 would be 100% valid for C5 GA. It turns out that Cloudera shipped a newer version of the Oracle JDK 7, and a symlink that the module creates on RedHat and Suse (/usr/java/default) ended up pointing at the wrong location. Upgrading to razorsedge/cloudera 2.0.2 fixes the issue.
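
A quick way to check whether a node is affected (the exact JDK directory depends on which RPM Cloudera shipped, so treat the paths as illustrative):

    # The symlink should resolve to a JDK directory that actually exists.
    readlink -f /usr/java/default
    ls -d /usr/java/jdk1.7.0_*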

Lesson learned: Test, test, and test some more.

Thanks to yuzi-co for reporting the problem.

https://forge.puppetlabs.com/razorsedge/cloudera

https://github.com/razorsedge/puppet-cloudera

Let me know if you have any feedback!

puppet cloudera module 2.0.1

This is a major release of my Puppet module to deploy Cloudera Manager. The major change is that razorsedge/cloudera now supports Cloudera’s latest release, Cloudera Enterprise 5, which adds support for Cloudera Manager 5 and Cloudera’s Distribution of Apache Hadoop (CDH) 5. Additionally, this module and its deployment via Puppet Enterprise 3.2 have been certified by Cloudera as tested and validated to work with Cloudera Enterprise 5.

This module is certified on Cloudera 5.

Other changes are:

  • All interaction with the cloudera module can now be done through the main ::cloudera class, including installation of the CM server. This means you can simply toggle the options in ::cloudera to have full functionality of the module.
  • Official operating system support for Debian 7.
  • Installation of Oracle JDK 7.
  • Recommended tuning of the vm.swappiness kernel parameter.
  • Installation of native LZO libraries when the parameter install_lzo => true is set, even when installing via parcels.
  • Conversion of the README.md file to the Puppet Labs recommended README.markdown formatting. This dramatically improves the presentation of what one needs to know to quickly become productive with the module.
  • Taking advantage of the new module metadata to add compatibility information to the module page on the Puppet Forge.

If you have not seen the previous changes in version 1.0.1, here is a recap (a combined usage sketch follows the list):

  • Allow for use of an external Java module. Not everyone will want to stick with the older Oracle JDK version that Cloudera ships in their software repositories. If you have a module that provides the Oracle JDK and sets $JAVA_HOME in the environment, then just set install_java => false in Class['cloudera'] and make sure the JDK is installed before calling Class['cloudera'].
  • Integrated installation of the Oracle Java Cryptography Extension (JCE) unlimited strength jurisdiction policy files. Set the parameter install_jce => true in Class['cloudera'].
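
Putting several of these options together, a minimal node declaration might look like the following sketch. The parameter names come from the notes above; the values are purely illustrative:

    # Sketch: drive everything through the main ::cloudera class.
    class { '::cloudera':
      install_jce => true,  # Oracle JCE unlimited strength policy files
      install_lzo => true,  # native LZO libraries, even with parcels
    }
    # Or bring your own JDK: set install_java => false and make sure the
    # JDK (and $JAVA_HOME) is managed before Class['cloudera'] is applied.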

Deprecation Warnings

  • The class parameters and variables yumserver and yumpath have been renamed to reposerver and repopath respectively. This makes the name more generic as it applies to APT and Zypprepo as well as YUM package repositories.
  • The use_gplextras parameter has been renamed to install_lzo.

One thing of note is that this module does not support upgrading from CDH4 to CDH5 packages, including Impala, Search, and GPL Extras.

https://forge.puppetlabs.com/razorsedge/cloudera

https://github.com/razorsedge/puppet-cloudera

Let me know if you have any feedback!