Tag Archives: Linux

Recovering from full Elasticsearch nodes

Recently I ran out of space on a 5 node Elasticsearch cluster. Events were not being indexed, and Logstash had amassed a 10GB disk-backed queue. It was not pretty

I discovered that the fifth node was configured incorrectly and was storing the ES data on one of the smaller disk partitions. I stopped the Elasticsearch service on this node while I formulated a plan.

Unfortunately, I didn’t have the time (or confidence) to move the entire /var directory to the large partition (which happened to be serving the /home folder: mounted as /dev/mapper/centos-home), so I instead created a new folder at /home/elasticsearch (so it would be on the large partition), and “symlinked”/var/elasticsearch to the new home folder on the larger partition ln -s /home/elasticsearch/elasticsearch /var/lib/elasticsearch

After creating the Symlink, I started the Elasticsearch service, and watched the logs. After some time, I noticed that there were still no primary shards assigned to this new nodes (despite it being the only node with disk space utilization below the threshold), so I dug in a bit more

This is where I learned about /_cluster/allocation/explain which provides details about why certain shards may have an allocation problem. Ah ha! After 5 failed attempts to unassigned shards to my new node, Elasticsearch just needed a little kick to re run the allocation process: I opened up the Kibana console, and ran POST /_cluster/reroute?retry_failed=true to force the algorithm to re-evaluate the location of shards

Within about 90 seconds, the Elasticsearch cluster began rerouting all of the unassigned shards, and my logstash disk-queue began to shrink as the events poured into the freshly allocated shards on my new node.

Problem solved.

Stay tuned for next week when I pay off the technical debt incurred by placing my Elasticsearch shards on a symlink ūüė¨

Configuring Multiple Networked OR USB APC UPS devices

As any proper IT-nerd would agree, UPS devices are critical pieces of equipment not only in the data center, but also at home. However, most home users are not in a position to acquire a massive 10+ kilovolt-amp UPS capable of protecting all circuits by which our various “personal devices” are powered; rather, most home/small office UPS installations are often small desktop units usually under 3kVA. In this scenario, these smaller units are typically allocated to individual devices / ares, and are typically only responsible for signaling the status of the incoming utility power to one device.

What about using multiple UPS devices for different components of the same “workspace”? Or home networks with access points and switches in more than one location (therefore each having its own battery backup)? How would one monitor multiple distributed battery backup units (presuming each UPS unit has only USB connectivity)?

APCUPSD

Enter apcupsd: “A daemon for controlling APC UPSes.” Unfortunately, the plurality of this utility’s tagline indicates a wide range of supported devices rather than multiple concurrently connected devices. To date, I’ve found one article describing how to configure apcupsd to support multiple USB-attached UPS devices, and it’s not really pretty. The gist of the process is as follows:

  1. Configure udev rules to ensure a consistent mapping (by UPS serial number) to a named mount point
  2. Create multiple apcupsd configuration files for each connected UPS
  3. Create new “action” and “control” scripts for each UPS
  4. Re-configure the apcupsd init.d/systemd scripts to launch multiple instances of the daemon (one for each UPS)

I’m generally not a fan of creating large “custom” configuration files in obscure locations with great variance from the distributed package, so this process seemed a little “hackey” to me; especially since the end result of all of these configuration files was to have “isolated processes” for each UPS to monitor.

Dockerizing APCUPSD

At this point, I decided to take a different approach to isolating each apcupsd process: an approach with far greater discoverability, version-control potential, and scalability. Docker.

I decided to use the first step outlined in the apcupsd guide on the Debian Wiki (creating udev rules to ensure physical devices are given a persistent path on boot/attach). UPS devices are generally mounted at /dev/usb/hiddev*, so we should confirm that we have a few present:

# ls /dev/usb
hiddev0  hiddev1

# lsusb
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 002: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 002: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

Great! we’ve got things that look like UPS devices on /dev/usb/hiddev0 and /dev/usb/hiddev1. Now to get the serialnumbers:

# udevadm info --attribute-walk --name=/dev/usb/hiddev0 | egrep 'manufacturer|product|serial'
    ATTRS{manufacturer}=="American Power Conversion"
    ATTRS{product}=="Back-UPS BX1500G FW:866.L5 .D USB FW:L5 "
    ATTRS{serial}=="8975309"
    ATTRS{manufacturer}=="Linux 4.4.0-170-generic ohci_hcd"
    ATTRS{product}=="OHCI PCI host controller"
    ATTRS{serial}=="0000:00:02.0"
# udevadm info --attribute-walk --name=/dev/usb/hiddev1 | egrep 'manufacturer|product|serial'
    ATTRS{manufacturer}=="American Power Conversion"
    ATTRS{product}=="Back-UPS NS 1100M2 FW:953.e3 .D USB FW:e3     "
    ATTRS{serial}=="8675310"
    ATTRS{manufacturer}=="Linux 4.4.0-170-generic ohci_hcd"
    ATTRS{product}=="OHCI PCI host controller"
    ATTRS{serial}=="0000:00:04.0"

With the now known serial numbers, we create udev rules to persist these devices to known map points:

## FILE AT /lib/udev/rules.d/ups.rules

# SCREEN UPS
KERNEL=="hiddev*", ATTRS{manufacturer}=="American Power Conversion", ATTRS{serial}=="8975309", OWNER="root", SYMLINK+="usb/ups-screen"

# ComputeAndNetwork UPS
KERNEL=="hiddev*", ATTRS{manufacturer}=="American Power Conversion", ATTRS{serial}=="8675310", OWNER="root", SYMLINK+="usb/ups-compute-and-network"

And now to re-run the udev rules:

udevadm trigger --verbose --sysname-match=hiddev*

Now, we should have some “nicely named” UPS USB devices:

# ls -la /dev/usb
total 0
drwxr-xr-x  2 root root    120 Dec 18 19:55 .
drwxr-xr-x 22 root root   4280 Dec 18 19:55 ..
crwxrwxrwx  1 root root 180, 0 Dec 18 19:55 hiddev0
crwxrwxrwx  1 root root 180, 1 Dec 18 19:55 hiddev1
lrwxrwxrwx  1 root root      7 Dec 18 19:55 ups-compute-and-network -> hiddev1
lrwxrwxrwx  1 root root      7 Dec 18 19:55 ups-screen -> hiddev0

Excellent! Now, anytime these devices are plugged/unplugged, we shouldn’t have to guess which is hiddev0 and which is hiddev1, since udev will automagically provide us named mount points for these USB devices, which will be critical to the next steps

Next, I created a docker-compose file with the three “services” I decided I’d like for this setup:

  • APCUPSD for the “screens” UPS
  • APCUPSD for the “Compute and Network” UPS
  • Apache/Multimon to provide an HTTP based interface

This docker-compose file also contained pointers to specific Dockerfiles to actually build an image for each service (hint: the two apcupsd services use the same container with different configurations).

The apcupsd container is nothing more than the latest Alpine linux image; apcupsd from the apk repository and a very lightweight apcupsd configuration files (configured to watch onlythe UPS at /dev/ups – more on this later)

The multimon container uses the latest Apache/alpine image, and adds apcupsd-webif from the apk repository along with a few configuration files for multimon. Additionally, I wrote an entrypoint.sh script to parse environment variables and generate a configuration file for multimon so that the UPS(s) displayed on the web interface can be set from the docker-compose file.

Having now covered the build process, let’s put together the docker-compose services:

Full Docker setup here: https://github.com/crossan007/APCUPSD-Multimon-Docker

Now, instead of attempting to create custom init scripts, multiplex processes within systemd, and override the packaged mechanisms for apcupsd‘s configuration discovery, I instead have a cleanly defined interface for isolating instances of apcupsd to provide a status page for my two APC UPS devices.

Thanks for reading, and hopefully this helps you in some way!

VAInfo: Verify Hardware Accelerated Video Support

On Ubuntu (and possibly other) Linux distros, run vainfo to see which Intel QuickSync profiles are supported.

For example, these profiles are supported on an Intel Haswell chip:

libva info: VA-API version 0.39.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_39
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.39 (libva 1.7.0)
vainfo: Driver version: Intel i965 driver for Intel(R) Haswell Desktop - 1.7.0
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264MultiviewHigh      : VAEntrypointVLD
      VAProfileH264MultiviewHigh      : VAEntrypointEncSlice
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileH264StereoHigh         : VAEntrypointEncSlice
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileJPEGBaseline           : VAEntrypointVLD

References:

Display HTTPS X509 Cert from Linux CLI

Recently, while attempting a git pull, I was confronted with the following error:

Peer's certificate issuer has been marked as not trusted by the user.

The operation worked on a browser on my dev machine, and closer inspection revealed that the cert used to serve the GitLab service was valid, but for some reason the remote CentOS Linux server couldn’t pull from the remote.

I found this post on StackOverflow detailing how to retrieve the X509 cert used to secure an HTTPS connection:

echo | openssl s_client -showcerts -servername MyGitServer.org -connect MyGitServer.org:443 2>/dev/null | openssl x509 -inform pem -noout -text

This was my ticket to discover why Git on my CentOS server didn’t like the certificate: the CentOS host was resolving the wrong DNS host name, and therefore using an invalid cert for the service.

And now a Haiku:

http://i.imgur.com/eAwdKEC.png

Backup Google Authenticator Database

Two factor authentication is great – I wish¬†everything would use it. ¬† My personal 2FA (specifically TOTP) ¬†mobile app is Google Authenticator. ¬†It allows you to scan a barcode, or manually enter a 2FA initilization token, and gives you a nice display of all of your stored 2FA tokens, with a great countdown of the token’s expiration. ¬†However, it does have one critical flaw feature: ¬†You can’t export your accounts.

Let me re-state that: ¬†Your 2FA tokens are locked away in your mobile device. ¬†Without the device, you’re locked out of your accounts (Hopefully you created backup codes). ¬†If your device becomes inoperable, good luck!

However, if you have root access to your device, you can grab the Google Authenticator database and stow it away for safe keeping by grabbing it from the following location on your phone:

/data/data/com.google.android.apps.authenticator2/

If you have ADB enabled, you can just run the following command:

 adb pull /data/data/com.google.android.apps.authenticator2 

Keep this information very secure, as it can be used to generate 2FA codes for all of your accounts!

Troubleshooting OwnCloud index.php

Sometimes OwnCloud includes “index.php” in the shared links. ¬†It’s annoying and ugly. ¬†Here’s some things to check:

  1. Is mod rewrite enabled in the apache config?
    <Directory /var/www/html/owncloud/>
     Options Indexes FollowSymLinks MultiViews
     AllowOverride All
     Order allow,deny
     Allow from all
     <IfModule mod_dav.c>
      Dav off
      </IfModule>
     SetEnv HOME /var/www/html/owncloud
     SetEnv HTTP_HOME /var/www/html/owncloud
    </Directory>
    
  2. Is the .htaccess correct?  The ###DO NOT EDIT### Section must contain this line (Generally the last line in the IfModule for mod_rewrite
    RewriteRule .* index.php [PT,E=PATH_INFO:$1]
    
  3. .htaccess must also contain this block for the web app to generate URLs without “index.php”
    <IfModule mod_rewrite.c>
      RewriteBase /
      <IfModule mod_env.c>
        SetEnv front_controller_active true
        <IfModule mod_dir.c>
          DirectorySlash off
        </IfModule>
      </IfModule>
    </IfModule>
    

Those are my findings for making sure OwnCloud URLs stay pretty.

Unifi Controller on 16.04

Steps to install the UniFi controller on Ubuntu 16.04.  Note that the package depends on JRE7, so we must add the ppa repo to apt.

echo "deb http://www.ubnt.com/downloads/unifi/debian stable ubiquiti" > /etc/apt/sources.list.d/ubnt.list
apt-key adv --keyserver keyserver.ubuntu.com --recv C0A52C50

sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update

sudo apt-get install unifi

Expand Ubuntu LVM

Expand an existing Ubuntu LVM without creating additional partitions, or adding to LVM VGs:

  1. Expand the physical device (VMware, HyperV, DD to a new physical device,  etc)
  2. Use offline GParted cd to re size the extended partition on which LVM lives
  3. In live OS, use parted “resizepart” to extend the logical partition inside of the previously re sized extended partition
    (parted) resizepart
    Partition number? 5
    End? [268GB]? 1099GB
    
  4. reboot
  5. use LVM to resize the PV:
    pvresize /dev/sda5
  6. resize the filesystem in the LV:
    resize2fs

 

References:

Managing ZFS Snapshots

ZFS is a great file system, providing much flexibility, and simple administration.   It was originally developed for Sun operating systems, but has been ported to Linux, and support is now baked in to Ubuntu Server 15.10.

ZFS natively supports snapshots, but there is a tool for automagically creating, and aging snapshots: https://github.com/zfsnap/zfsnap.  (Man page: http://www.zfsnap.org/zfsnap_manpage.html)

Installation is fairly straight forward:

  • clone the Git repo
  • copy the files to /usr/local/src
  • create a link to the binaries path:
    ln -s /usr/local/src/zfsnap/sbin/zfsnap.sh /usr/local/sbin/zfsnap

After that, set up a crontab for automatic snapshot creation:

crontab -e
26 * * * * /usr/local/sbin/zfsnap snapshot -a 5d -r tank

And finally, set up crontab for automatic snapshot deletion:

 0  1 * * * /usr/local/sbin/zfsnap destroy -r tank