aws-labs

How to set up a VPN between strongSwan and Cloud VPN

This guide walks you through how to configure strongSwan for integration with Google Cloud VPN. This information is provided as an example only. This guide is not meant to be a comprehensive overview of IPsec and assumes basic familiarity with the IPsec protocol.

Environment overview

The equipment used in the creation of this guide is as follows:

  • Vendor: strongSwan
  • Software release: 5.5.1 on Debian 9.6

Topology

The topology outlined by this guide is a basic site-to-site IPsec VPN tunnel configuration using the referenced device:


Before you begin

Prerequisites

To use strongSwan with Cloud VPN, make sure the following prerequisites have been met:

  • The VM or server that runs strongSwan is healthy and has no known issues.
  • You have root access to the strongSwan instance.
  • Your on-premises firewall allows UDP port 500, UDP port 4500, and ESP packets (see the example rules after this list).
  • You are able to configure your on-premises router to route traffic through the strongSwan VPN gateway. Some environments might not give you that option.
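
For example, if your on-premises firewall is a Linux host managed with iptables, rules along these lines would admit the required traffic. This is a minimal sketch only; interface names, chains, and source restrictions will depend on your environment.

# Allow IKE (UDP 500), NAT traversal (UDP 4500), and ESP
iptables -A INPUT -p udp --dport 500 -j ACCEPT
iptables -A INPUT -p udp --dport 4500 -j ACCEPT
iptables -A INPUT -p esp -j ACCEPT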

IPsec parameters

Cloud VPN supports an extensive list of ciphers that can be used per your security policies. The following parameters and values are used in the Gateway’s IPsec configuration for the purpose of this guide.

Parameter | Value
IPsec Mode | Tunnel mode
Auth protocol | Pre-shared key
Key Exchange | IKEv2
Start | Auto
Perfect Forward Secrecy (PFS) | on

These are the Cipher configuration settings for IKE phase 1 and phase 2 that are used in this guide.

Phase | Cipher role | Cipher
Phase 1 (IKE) | Encryption | aes256gcm16
Phase 1 (IKE) | Integrity | sha512
Phase 1 (IKE) | Diffie-Hellman | modp4096 (Group 16)
Phase 1 (IKE) | Lifetime | 36,000 seconds
Phase 2 (ESP) | Encryption | aes256gcm16
Phase 2 (ESP) | Integrity | sha512
Phase 2 (ESP) | Diffie-Hellman | modp8192 (Group 18)
Phase 2 (ESP) | Lifetime | 10,800 seconds

Configuring policy-based IPsec VPN

Below is a sample environment to walk you through the setup of a policy-based VPN. Make sure to replace the IP addresses in the sample environment with your own IP addresses.

Cloud VPN

Name | Value
Cloud VPN (external IP) | 35.204.151.163
VPC CIDR | 192.168.0.0/24

strongSwan

Name | Value
External IP | 35.204.200.153
CIDR behind strongSwan | 10.164.0.0/20

Configuration of Google Cloud

To configure Cloud VPN:

  1. In the Cloud Console, select Networking > Create VPN connection.
  2. Click CREATE VPN CONNECTION.
  3. Populate the fields for the gateway and tunnel as shown in the following table, and click Create:
Parameter | Value | Description
Name | gcp-to-strongswan-1 | Name of the VPN gateway.
Description | VPN tunnel connection between GCP and strongSwan | Description of the VPN connection.
Network | to-sw | The Google Cloud network the VPN gateway attaches to. This network will get VPN connectivity.
Region | europe-west4 | The home region of the VPN gateway. Make sure the VPN gateway is in the same region as the subnetworks it is connecting to.
IP address | gcp-to-strongswan (35.204.151.163) | The VPN gateway uses a static public IP address. An existing, unused, static public IP address within the project can be assigned, or a new one created.
Remote peer IP address | 35.204.200.153 | Public IP address of the on-premises VPN appliance used to connect to the Cloud VPN.
IKE version | IKEv2 | The IKE protocol version. You can select IKEv1 or IKEv2.
Shared secret | secret | A shared secret used for authentication by the VPN gateways. Configure the on-premises VPN gateway tunnel entry with the same shared secret.
Routing options | Policy-based | Multiple routing options for the exchange of route information between the VPN gateways. This example uses static routing.
Remote network IP ranges | 10.164.0.0/20 | The on-premises CIDR blocks connecting to Google Cloud from the VPN gateway.
Local IP ranges | 192.168.0.0/24 | The Google Cloud IP ranges matching the selected subnet.

Configuration of strongSwan

To install strongSwan on Debian 9.6 or Ubuntu 18.04, use the following commands:

sudo apt update
sudo apt install strongswan strongswan-pki

To install strongSwan on RHEL 7 or CentOS 7, use the following command:

yum install strongswan

Step 1: Ensure that IP forwarding is enabled

The server that hosts strongSwan acts as a gateway, so IP forwarding must be enabled via the net.ipv4.ip_forward sysctl.

To check its current status, use the following command:

sysctl net.ipv4.ip_forward

To enable it temporarily (until reboot), use the following command:

sysctl -w net.ipv4.ip_forward=1

To make the change permanent, add the following line to a sysctl configuration file:

/etc/sysctl.d/99-forwarding.conf:

net.ipv4.ip_forward = 1
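
To apply the setting from that file without rebooting, you can reload the sysctl configuration. This assumes the file was placed under /etc/sysctl.d/ as shown above.

sysctl --system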

Step 2: Configure IPsec credentials

Ensure that the following line is present in the file:

/var/lib/strongswan/ipsec.secrets.inc

35.204.151.163 : PSK "secret"

Step 3: Configure the IPSec connection

/var/lib/strongswan/ipsec.conf.inc

include /etc/ipsec.d/gcp.conf

/etc/ipsec.d/gcp.conf

conn %default
    ikelifetime=600m # 36,000 s
    keylife=180m # 10,800 s
    rekeymargin=3m
    keyingtries=3
    keyexchange=ikev2
    mobike=no
    ike=aes256gcm16-sha512-modp4096
    esp=aes256gcm16-sha512-modp8192
    authby=psk

conn net-net
    left=35.204.200.153 # In case of NAT, set to the internal IP, e.g. 10.164.0.6
    leftid=35.204.200.153
    leftsubnet=10.164.0.0/20
    leftauth=psk
    right=35.204.151.163
    rightid=35.204.151.163
    rightsubnet=192.168.0.0/24
    rightauth=psk
    type=tunnel
    # auto=add - means strongSwan won't try to initiate it
    # auto=start - means strongSwan will try to establish connection as well
    # Note that Google Cloud will also try to initiate the connection
    auto=start
    # dpdaction=restart - means strongSwan will try to reconnect if Dead Peer Detection spots
    #                  a problem. Change to 'clear' if needed
    dpdaction=restart

Step 4: Start strongSwan

Now you can start strongSwan:

systemctl start strongswan

After you make sure it’s working as expected, you can add strongSwan to autostart:

systemctl enable strongswan
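
To verify that the tunnel is up and passing traffic, you can query strongSwan and ping across the tunnel. The 192.168.0.2 address below is only a placeholder for a VM inside the Cloud VPN's VPC range.

ipsec statusall net-net
ping -c 4 192.168.0.2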

Configuring a dynamic (BGP) IPsec VPN tunnel with strongSwan and BIRD

In this example, a dynamic BGP-based VPN uses a VTI interface. This guide is based on the official strongSwan wiki.

The following sample environment walks you through the setup of a route-based VPN. Make sure to replace the IP addresses in the sample environment with your own IP addresses.

This guide assumes that you have BIRD 1.6.3 installed on your strongSwan server.

Cloud VPN

Name | Value
Cloud VPN (external IP) | 35.204.151.163
VPC CIDR | 192.168.0.0/24
TUN-INSIDE GCP | 169.254.2.1
GCP ASN | 65000

strongSwan

Name | Value
External IP | 35.204.200.153
CIDR behind strongSwan | 10.164.0.0/20
TUN-INSIDE SW | 169.254.2.2
strongSwan ASN | 65002

Configuration of Google Cloud

With a route-based VPN, you can use both static and dynamic routing. This example uses dynamic (BGP) routing. Cloud Router is used to establish BGP sessions between the two peers.

Configuring a cloud router

Step 1: In the Cloud Console, select Networking > Cloud Routers > Create Router.

Step 2: Enter the following parameters, and click Create.

Parameter | Value | Description
Name | gcp-to-strongswan-router-1 | Name of the cloud router.
Description | | Description of the cloud router.
Network | to-sw | The Google Cloud network the cloud router attaches to. This is the network that manages route information.
Region | europe-west4 | The home region of the cloud router. Make sure the cloud router is in the same region as the subnetworks it is connecting to.
Google ASN | 65000 | The Autonomous System Number assigned to the cloud router. Use any unused private ASN (64512 – 65534, 4200000000 – 4294967294).

Configuring Cloud VPN

Step 1: In the Cloud Console, select Networking > Interconnect > VPN > CREATE VPN CONNECTION.

Step 2: Enter the following parameters for the Compute Engine VPN gateway:

Parameter | Value | Description
Name | gcp-to-strongswan-1 | Name of the VPN gateway.
Description | VPN tunnel connection between GCP and strongSwan | Description of the VPN connection.
Network | to-sw | The Google Cloud network the VPN gateway attaches to. This network will get VPN connectivity.
Region | europe-west4 | The home region of the VPN gateway. Make sure the VPN gateway is in the same region as the subnetworks it is connecting to.
IP address | gcp-to-strongswan (35.204.151.163) | The VPN gateway uses a static public IP address. An existing, unused, static public IP address within the project can be assigned, or a new one created.

Step 3: Enter the following parameters for the tunnel:

Parameter | Value | Description
Name | gcp-to-strongswan-1-tunnel-1 | Name of the VPN tunnel.
Description | VPN tunnel connection between GCP and strongSwan | Description of the VPN tunnel.
Remote peer IP address | 35.204.200.153 | Public IP address of the on-premises VPN appliance used to connect to the Cloud VPN.
IKE version | IKEv2 | The IKE protocol version. You can select IKEv1 or IKEv2.
Shared secret | secret | A shared secret used for authentication by the VPN gateways. Configure the on-premises VPN gateway tunnel entry with the same shared secret.
Routing options | Dynamic (BGP) | Multiple routing options for the exchange of route information between the VPN gateways. This example uses dynamic (BGP) routing.
Cloud Router | gcp-to-strongswan-router-1 | Select the cloud router you created previously.
BGP session | | BGP sessions enable your cloud network and on-premises networks to dynamically exchange routes.

Step 4: Enter the parameters as shown in the following table for the BGP peering:

Parameter | Value | Description
Name | gcp-to-strongswan-bgp | Name of the BGP session.
Peer ASN | 65002 | Unique BGP ASN of the on-premises router.
Google BGP IP address | 169.254.2.1 |
Peer BGP IP address | 169.254.2.2 |

Click Save and Continue to complete.

Note: Add ingress firewall rules to allow inbound network traffic as per your security policy.

Configuration of strongSwan

This guide assumes that you have strongSwan already installed. It also assumes a default layout of Debian 9.6.

Step 1: Configure BIRD

/etc/bird/bird.conf

# Config example for bird 1.6 
#debug protocols all;

router id 169.254.2.2;

# Watch interface up/down events
protocol device {
       scan time 10;
}

# Import interface routes (Connected)
# (Not required in this example as kernel import all is used here to workaround the /32 on eth0 GCE VM setup)
#protocol direct {
#       interface "*";
#}

# Sync routes to kernel
protocol kernel {
       learn;
       merge paths on; # For ECMP
       export filter {
              krt_prefsrc = 10.164.0.6; # Internal IP Address of the strongSwan VM.
              accept; # Sync all routes to kernel
       };
       import all; # Required due to /32 on GCE VMs for the static route below
}

# Configure a static route to make sure route exists
protocol static {
       # Network connected to eth0
       route 10.164.0.0/20 recursive 10.164.0.1; # Network connected to eth0
       # Or blackhole the aggregate
       # route 10.164.0.0/20 blackhole;
}

# Prefix lists for routing security
# (Accept /24 as the most specific route)
define GCP_VPC_A_PREFIXES = [ 192.168.0.0/16{16,24} ]; # VPC A address space
define LOCAL_PREFIXES     = [ 10.164.0.0/16{16,24} ];  # Local address space

# Filter received prefixes
filter gcp_vpc_a_in
{
      if (net ~ GCP_VPC_A_PREFIXES) then accept;
      else reject;
}

# Filter advertised prefixes
filter gcp_vpc_a_out
{
      if (net ~ LOCAL_PREFIXES) then accept;
      else reject;
}

template bgp gcp_vpc_a {
       keepalive time 20;
       hold time 60;
       graceful restart aware; # Cloud Router uses GR during maintenance
       #multihop 3; # Required for Dedicated/Partner Interconnect

       import filter gcp_vpc_a_in;
       import limit 10 action warn; # restart | block | disable

       export filter gcp_vpc_a_out;
       export limit 10 action warn; # restart | block | disable
}

protocol bgp gcp_vpc_a_tun1 from gcp_vpc_a
{
       local 169.254.2.2 as 65002;
       neighbor 169.254.2.1 as 65000;
}

Step 2: Disable automatic routes in strongSwan

Routes are handled by BIRD, so you must disable automatic route creation in strongSwan.

/etc/strongswan.d/vti.conf

charon {
    # We will handle routes by ourselves
    install_routes = no
}

Step 3: Create a script that will configure the VTI interface

This script is called every time a new tunnel is established, and it takes care of proper interface configuration, including MTU, etc.

/var/lib/strongswan/ipsec-vti.sh

#!/bin/bash
set -o nounset
set -o errexit

IP=$(which ip)

PLUTO_MARK_OUT_ARR=(${PLUTO_MARK_OUT//// })
PLUTO_MARK_IN_ARR=(${PLUTO_MARK_IN//// })

VTI_TUNNEL_ID=${1}
VTI_REMOTE=${2}
VTI_LOCAL=${3}

LOCAL_IF="${PLUTO_INTERFACE}"
VTI_IF="vti${VTI_TUNNEL_ID}"
# GCP's MTU is 1460, so it's hardcoded
GCP_MTU="1460"
# ipsec overhead is 73 bytes, we need to compute new mtu.
VTI_MTU=$((GCP_MTU-73))

case "${PLUTO_VERB}" in
    up-client)
        ${IP} link add ${VTI_IF} type vti local ${PLUTO_ME} remote ${PLUTO_PEER} okey ${PLUTO_MARK_OUT_ARR[0]} ikey ${PLUTO_MARK_IN_ARR[0]}
        ${IP} addr add ${VTI_LOCAL} remote ${VTI_REMOTE} dev "${VTI_IF}"
        ${IP} link set ${VTI_IF} up mtu ${VTI_MTU}

        # Disable IPSEC Policy
        sysctl -w net.ipv4.conf.${VTI_IF}.disable_policy=1

        # Enable loose source validation, if possible. Otherwise disable validation.
        sysctl -w net.ipv4.conf.${VTI_IF}.rp_filter=2 || sysctl -w net.ipv4.conf.${VTI_IF}.rp_filter=0

        # If you would like to use VTI for policy-based routing, you need to take care of routing yourself, e.g.:
        #if [[ "${PLUTO_PEER_CLIENT}" != "0.0.0.0/0" ]]; then
        #    ${IP} r add "${PLUTO_PEER_CLIENT}" dev "${VTI_IF}"
        #fi
        ;;
    down-client)
        ${IP} tunnel del "${VTI_IF}"
        ;;
esac

# Enable IPv4 forwarding
sysctl -w net.ipv4.ip_forward=1

# Disable IPSEC Encryption on local net
sysctl -w net.ipv4.conf.${LOCAL_IF}.disable_xfrm=1
sysctl -w net.ipv4.conf.${LOCAL_IF}.disable_policy=1

You should also make /var/lib/strongswan/ipsec-vti.sh executable by using the following command:

chmod +x /var/lib/strongswan/ipsec-vti.sh

Step 4: Configure IPsec credentials

Ensure that the following line is in the file:

/var/lib/strongswan/ipsec.secrets.inc

35.204.151.163 : PSK "secret"

Step 5: Configure IPsec connection

/var/lib/strongswan/ipsec.conf.inc

include /etc/ipsec.d/gcp.conf

/etc/ipsec.d/gcp.conf

conn %default
    ikelifetime=600m # 36,000 s
    keylife=180m # 10,800 s
    rekeymargin=3m
    keyingtries=3
    keyexchange=ikev2
    mobike=no
    ike=aes256gcm16-sha512-modp4096
    esp=aes256gcm16-sha512-modp8192
    authby=psk

conn net-net
    leftupdown="/var/lib/strongswan/ipsec-vti.sh 0 169.254.2.1/30 169.254.2.2/30"
    left=35.204.200.153 # In case of NAT set to internal IP, e.x. 10.164.0.6
    leftid=35.204.200.153
    leftsubnet=0.0.0.0/0
    leftauth=psk
    right=35.204.151.163
    rightid=35.204.151.163
    rightsubnet=0.0.0.0/0
    rightauth=psk
    type=tunnel
    # auto=add - means strongSwan won't try to initiate it
    # auto=start - means strongSwan will try to establish connection as well
    # Note that Google Cloud will also try to initiate the connection
    auto=start
    # dpdaction=restart - means strongSwan will try to reconnect if Dead Peer Detection spots
    #                  a problem. Change to 'clear' if needed
    dpdaction=restart
    # mark=%unique - We use this to mark VPN-related packets with iptables
    #                %unique ensures that all tunnels will have a unique mark here
    mark=%unique

leftupdown contains the path to a script and its command-line parameters:

  • The first parameter is the tunnel ID, because you cannot rely on strongSwan’s PLUTO_UNIQUEID variable if you need the tunnel ID to be persistent.
  • The second parameter specifies the Cloud Router IP and the configured subnet.
  • The third parameter specifies the IP address of the vti0 interface, where BIRD is configured.

Step 6: Start strongSwan and BIRD

systemctl start bird
systemctl start strongswan

After you make sure it’s working as expected, you can add BIRD and strongSwan to autostart:

systemctl enable bird
systemctl enable strongswan
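
To confirm that both the tunnel and the BGP session are up, you can query strongSwan and BIRD. birdc is the BIRD 1.6 command-line client, and the protocol name matches the bird.conf shown earlier.

ipsec status net-net
birdc show protocols all gcp_vpc_a_tun1
birdc show route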


How to Shrink a WSL2 Virtual Disk

Before you begin

Before shrinking a WSL2 virtual disk, you need to ensure that WSL2 is not running.

You can check if it’s running with the command 'wsl.exe --list --verbose' in PowerShell:

PowerShell

PS C:\Users\valorin> wsl.exe --list --verbose
  NAME            STATE           VERSION
* WLinux          Running         2
  Debian          Stopped         2
  Ubuntu-18.04    Stopped         2
  kali-linux      Stopped         2

It should stop when it’s idle, or you can encourage it to stop with the 'wsl.exe --terminate' command:

PowerShell

PS C:\Users\valorin> wsl.exe --terminate WLinux

I also highly recommend you take a backup of your WSL2 installation.

These instructions worked for me, but you could have a different environment that may result in corrupted data. So please, take a backup first!
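
If you want a quick backup before proceeding, wsl.exe can export a distribution to a tar file. The destination path below is only an example, and the distribution name should match one from the list above.

PowerShell

PS C:\Users\valorin> wsl.exe --export WLinux D:\backups\WLinux-backup.tar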

Use diskpart to Shrink a WSL2 Virtual Disk

I discovered you can use the 'diskpart' tool to compact a VHDX. This allows you to shrink a WSL2 virtual disk file, reclaiming disk space. It appeared to work for me without any data corruption, taking the file size down from 100GB to 15GB. Your results may vary though.

You can launch the diskpart tool in PowerShell:

PowerShell

PS C:\Users\valorin> diskpart

It will open up a new window:


Once that has opened, you need to specify the path to your VHDX file.

If you don’t know this path, you can find it by first locating the package directory for your WSL2 instance, which lives in: C:\Users\valorin\AppData\Local\Packages\. Look for the vendor name, such as WhitewaterFoundryLtd.Co for Pengwin, CanonicalGroupLimited for Ubuntu, or TheDebianProject for Debian. Once you’ve identified the folder, you’ll find the VHDX in the LocalState subdirectory.

For me, this path is:
C:\Users\valorin\AppData\Local\Packages\WhitewaterFoundryLtd.Co.16571368D6CFF_kd...\LocalState\ext4.vhdx
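
If you would rather search than browse, a PowerShell one-liner along these lines should locate any ext4.vhdx files under your Packages directory. This is only a convenience and not required for the procedure.

PowerShell

PS C:\Users\valorin> Get-ChildItem "$env:LOCALAPPDATA\Packages" -Recurse -Filter ext4.vhdx -ErrorAction SilentlyContinue | Select-Object -ExpandProperty FullName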

With the full path to the VHDX, you can select it within diskpart:

Shell

DISKPART> select vdisk file="C:\Users\valorin\AppData\Local\Packages\WhitewaterFoundryLtd.Co.16571368D6CFF_kd...\LocalState\ext4.vhdx"

DiskPart successfully selected the virtual disk file.

Once it’s selected, you can ask diskpart to compact it:

Shell

DISKPART> compact vdisk

  100 percent completed

DiskPart successfully compacted the virtual disk file.

Once that has finished, you can close diskpart.

If you check your VHDX now, you should see it has reduced in size. How big the reduction is depends on how much empty space WSL2 was holding on to. In my case, it was quite significant:

Before

PowerShell

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         9/02/2020  12:04 PM                temp
-a----         9/02/2020   1:04 PM    94778687488 ext4.vhdx
-a----        29/07/2019   3:48 PM              0 fsserver

After

PowerShell

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         9/02/2020  12:04 PM                temp
-a----         9/02/2020   1:04 PM    14533263360 ext4.vhdx
-a----        29/07/2019   3:48 PM              0 fsserver

I hope you found this useful.


How to Setup Local APT Repository Server on Ubuntu 20.04

One of the reasons why you may consider setting up a local apt repository server is to minimize the bandwidth required if you have multiple instances of Ubuntu to update. Take, for instance, a situation where you have 20 or so servers that all need to be updated twice a week. You could save a great deal of bandwidth because all you need to do is update all your systems over a LAN from your local repository server.

In this guide, you will learn how to set up a local apt repository server on Ubuntu 20.04 LTS.

Prerequisites

  • Ubuntu 20.04 LTS system
  • Apache Web Server
  • Minimum of 170 GB free disk space on /var/www/html file system
  • Stable internet connection

Step 1) Create a local Apache Web Server

First off, log in to your Ubuntu 20.04 system and install the Apache web server as shown:

$ sudo apt install -y apache2

Enable the apache2 service so that it persists across reboots. Run the following command:

$ sudo systemctl enable apache2

Apache’s default document root directory is located at the /var/www/html path. We are later going to create a repository directory in this path that will contain the required packages.

Step 2) Create a package repository directory

Next, we will create a local repository directory called ubuntu in the /var/www/html path.

$ sudo mkdir -p /var/www/html/ubuntu

Set the required permissions on the newly created directory.

$ sudo chown www-data:www-data /var/www/html/ubuntu

Step 3) Install apt-mirror

The next step is to install the apt-mirror package. It provides the apt-mirror command, which downloads and syncs the remote Debian packages to the local repository on our server. To install it, run the following:

$ sudo apt update
$ sudo apt install -y apt-mirror

Step 4) Configure repositories to mirror or sync

Once apt-mirror is installed, its configuration file /etc/apt/mirror.list is created automatically. This file contains the list of repositories that will be downloaded and synced to the local folder of our Ubuntu server, in our case /var/www/html/ubuntu/. Before making changes to this file, let’s back it up first.

$ sudo cp /etc/apt/mirror.list /etc/apt/mirror.list-bak

Now edit the file using vi editor and update base_path and repositories as shown below.

$ sudo vi /etc/apt/mirror.list

############# config ###################
set base_path    /var/www/html/ubuntu
set nthreads     20
set _tilde 0
############# end config ##############
deb http://archive.ubuntu.com/ubuntu focal main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu focal-security main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu focal-updates main restricted universe multiverse
clean http://archive.ubuntu.com/ubuntu

Save and exit the file.


You might have noticed that I have used the Ubuntu 20.04 LTS package repositories and commented out the source package repositories, as I don’t have enough space on my system. If you wish to download or sync source packages too, uncomment (or add) the lines that start with 'deb-src'.

Step 5) Start mirroring the remote repositories to local folder

Before you start mirroring or syncing, copy the postmirror.sh script to the folder /var/www/html/ubuntu/var using the commands below:

$ sudo mkdir -p /var/www/html/ubuntu/var
$ sudo cp /var/spool/apt-mirror/var/postmirror.sh /var/www/html/ubuntu/var

Now, it’s time to start mirroring the packages from the remote repositories to our system’s local folder. Execute the following:

$ sudo apt-mirror

The above command can also be started in the background using nohup:

$ nohup sudo apt-mirror &

To monitor the mirroring progress, use:

$ tail nohup.out

In Ubuntu 20.04 LTS, apt-mirror does not sync the CNF directory and its files, so we have to download and copy that folder and its files manually. To avoid downloading the CNF directory by hand, create a shell script with the contents below:

$ vi cnf.sh
#!/bin/bash
# Fetch the CNF (Commands-*) index files for each suite and component
for p in "${1:-focal}"{,-{security,updates}}/{main,restricted,universe,multiverse}; do
  >&2 echo "${p}"
  wget -q -c -r -np -R "index.html*" \
    "http://archive.ubuntu.com/ubuntu/dists/${p}/cnf/Commands-amd64.xz"
  wget -q -c -r -np -R "index.html*" \
    "http://archive.ubuntu.com/ubuntu/dists/${p}/cnf/Commands-i386.xz"
done

Save and close the script.

Execute the script to download CNF directory and its files.

$ chmod +x cnf.sh
$ bash  cnf.sh

This script will create a folder named archive.ubuntu.com in the present working directory. Copy this folder to the mirror folder:

$ sudo cp -av archive.ubuntu.com  /var/www/html/ubuntu/mirror/

Note: If we don’t sync the CNF directory, client machines will report errors like the following, so to resolve them we have to create and execute the script above.

E: Failed to fetch http://x.x.x.x/ubuntu/mirror/archive.ubuntu.com/ubuntu/dists/\
focal/restricted/cnf/Commands-amd64  404  Not Found [IP:169.144.104.219 80]
E: Failed to fetch http://x.x.x.x/ubuntu/mirror/archive.ubuntu.com/ubuntu/dists/\
focal-updates/main/cnf/Commands-amd64  404  Not Found [IP:169.144.104.219 80]
E: Failed to fetch http://x.x.x.x/ubuntu/mirror/archive.ubuntu.com/ubuntu/dists/\
focal-security/main/cnf/Commands-amd64  404  Not Found [IP:169.144.104.219 80]

Scheduling Automatic Repositories Sync Up

Configure a cron job to automatically update the local apt repositories. It is recommended to run this job nightly.

Run ‘crontab -e’ and add the following entry to be executed daily at 1:00 AM:

$ sudo crontab -e

00  01  *  *  *  /usr/bin/apt-mirror

Save and close.

Note: If a firewall is running on the Ubuntu server, allow port 80 with the following command:

$ sudo ufw allow 80

Step 6) Accessing Local APT repository via web browser

To access our locally configured apt repository via a web browser, use the following URL:

http://<Server-IP>/ubuntu/mirror/archive.ubuntu.com/ubuntu/dists/


Step 7) Configure Ubuntu 20.04 client to use local apt repository server

To test and verify that our apt repository server is working, I have another Ubuntu 20.04 LTS system where I will update the /etc/apt/sources.list file so that the apt command points to the local repositories instead of the remote ones.

So, log in to the system and change the following in sources.list:

http://archive.ubuntu.com/ubuntu
to
http://169.144.104.219/ubuntu/mirror/archive.ubuntu.com/ubuntu

Here, 169.144.104.219 is the IP address of my apt repository server; replace it with the address that suits your environment.

Also make sure to comment out all other repositories that are not mirrored on our apt repository server. After making the changes, the sources.list file would look something like the example below:

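Based on the mirror list configured earlier, the client’s /etc/apt/sources.list would contain lines like these. The IP address is my repository server’s; adjust it for your environment.

deb http://169.144.104.219/ubuntu/mirror/archive.ubuntu.com/ubuntu focal main restricted universe multiverse
deb http://169.144.104.219/ubuntu/mirror/archive.ubuntu.com/ubuntu focal-security main restricted universe multiverse
deb http://169.144.104.219/ubuntu/mirror/archive.ubuntu.com/ubuntu focal-updates main restricted universe multiverse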

Now run the ‘apt update’ command to verify that the client machine is getting updates from our local apt repository server:

$ sudo apt update

Perfect, the above output confirms that the client machine is able to connect to our repository to fetch packages and updates. That’s all for this article; I hope this guide helps you set up a local apt repository server on your Ubuntu 20.04 system.

copied from -> https://www.linuxtechi.com/setup-local-apt-repository-server-ubuntu/


Logical Volume Manager (LVM) versus standard partitioning in Linux

Traditional storage management

I use the phrase traditional storage management to describe the process of partitioning, formatting, and mounting storage capacity from a basic hard disk drive. I contrast this standard partitioning with an alternative method called Logical Volume Manager, or LVM.

Storage space is typically managed based on the maximum capacity of individual hard disk drives. The result is that when a sysadmin thinks about storage, they do so based on each drive. For example, if a server has three hard disk drives of 1 TB each, the sysadmin considers the storage literally: "I have three 1 TB drives to work with."

[Figure: An administrator thinks of standard partitions based on individual drive capacity ("I have three hard drives of one terabyte each").]
[Figure: Three 1 TB hard drives with partitions and mount points. The partitions are entirely contained on the individual hard disk drives.]

Let’s very quickly review traditional storage management. Here is a sample scenario:

1. Install a new hard disk drive

Purchase a one terabyte (1 TB) hard disk drive, and then physically install it into the server.

2. Partition the drive

Use fdisk or gparted to create one or more partitions. It’s important to note that the partitions cannot consume more than the total 1 TB of disk capacity.

Example fdisk command:

# fdisk /dev/sdb

I won’t cover the syntax for fdisk in this article, but assume I created a single partition that consumes the entire 1 TB disk. The partition is /dev/sdb1.

Display the capacity by using the /proc/partitions and lsblk content:

# cat /proc/partitions
# lsblk

3. Create a filesystem

Create a filesystem on the new partition by using the mkfs command. You could use ext4 or RHEL’s default XFS filesystem.

# mkfs.ext4 /dev/sdb1

While XFS is Red Hat’s default, it may not be as flexible when combined with LVM as ext4: XFS filesystems can easily be extended but not reduced. I’ll expand on that idea further toward the end of the article.

4. Create a mount point

The rest of this process is relatively standard. First, create a directory to serve as a mount point. Next, manually mount the partition to the mount point.

# mkdir /newstorage
# mount /dev/sdb1 /newstorage

5. Confirm the storage capacity

Use the du command to confirm the storage space is accessible and of the expected size.

# du -h /newstorage


Note: The -h option displays the output of du in capacity terms that are easy for humans to understand, such as GB or TB.

6. Configure the space to mount at boot

Edit the /etc/fstab file to mount the filesystem at boot. If you need a reminder on /etc/fstab, check out Tyler Carrigan’s article An introduction to the Linux /etc/fstab file here on Enable Sysadmin.

Logical Volume Manager (LVM)

Traditional storage capacity is based on individual disk capacity. LVM uses a different concept. Storage space is managed by combining or pooling the capacity of the available drives. With traditional storage, three 1 TB disks are handled individually. With LVM, those same three disks are considered to be 3 TB of aggregated storage capacity. This is accomplished by designating the storage disks as Physical Volumes (PV), or storage capacity useable by LVM. The PVs are then added to one or more Volume Groups (VGs). The VGs are carved into one or more Logical Volumes (LVs), which then are treated as traditional partitions.

[Figure: An administrator thinks of LVM as total combined storage space ("I have three terabytes of total storage").]
[Figure: Three hard disk drives are combined into one volume group that is then carved into two logical volumes.]

Source: Red Hat LVM Architecture Overview

1. Install a new hard disk drive

Obviously, there needs to be a storage disk available. Just as we saw above, you must physically install a drive in the server.

2. Designate Physical Volumes

Physical Volumes (PV) are disks or partitions that are available to LVM as potential storage capacity. They have identifiers and metadata that describe each PV. It is interesting to note that, as opposed to RAID, PVs do not have to be the same size or on disks that are the same speed. You can mix and match drive types to create PVs. To implement LVM, first designate a drive as a Physical Volume.

Command to create a PV:

# pvcreate /dev/sdb1
# pvcreate /dev/sdc

These two command examples are slightly different. The first command designates partition 1 on storage disk b as a PV. The second command sets the total capacity of storage disk c as a PV.

Display PV capacity and additional information:

# pvdisplay

This command displays all of the Physical Volumes configured on the server.

3. Manage Volume Groups

Once one or more of the disks are available to LVM as Physical Volumes, the storage capacity is combined into Volume Groups (VGs). There may be more than one VG on a server, and disks may be members of more than one VG (but PVs themselves may only be members of one VG).

Use the vgcreate command to create a new Volume Group. The VG must have at least one member. The command syntax is:

vgcreate name-of-new-VG PV-members

Use the following command to create a Volume Group named vg00 with /dev/sdb1 and /dev/sdc as members:

# vgcreate vg00 /dev/sdb1 /dev/sdc

Display information for a VG named vg00:

# vgdisplay vg00

4. Manage Logical Volumes

The VG can be subdivided into one or more Logical Volumes (LVs). These Logical Volumes are then used as if they were traditional partitions. The VG has a total capacity, and then some part of that capacity is allocated to a Logical Volume.

The lvcreate command carves storage capacity from a VG. There are a few options to be aware of.

Option | Description
-n | Name of LV – ex. sales-lv
-L | Size in G or T – ex. 10G
-q | Quiet, suppresses command output
-v | Verbose mode providing additional details

The syntax for the lvcreate command is as follows:

lvcreate -L size -n lvname vgname

Here is the command to create a 10 GB Logical Volume named sales-lv carved from the vg00 Volume Group:

# lvcreate -L 10G -n sales-lv vg00

As you recall, we created the vg00 Volume Group from two Physical Volumes, /dev/sdb1 and /dev/sdc. So, in summary, we combined the capacity of /dev/sdb1 and /dev/sdc into vg00, then carved a Logical Volume named sales-lv from that aggregated storage space.

You can use the lvdisplay command to see the Logical Volume’s configuration.

# lvdisplay /dev/vg00/sales-lv

5. Apply a filesystem and set a mount point

Once the LV is created, it is managed as any other partition. It needs a filesystem and a mount point, just like we configured in the standard partition management section above.

  1. Run the mkfs.ext4 command on the LV.
  2. Create a mount point by using mkdir.
  3. Manually mount the volume using the mount command, or edit the /etc/fstab file to mount the volume automatically when the system boots.
  4. Use the df -h command to verify the storage capacity is available (a consolidated sketch follows this list).
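
Putting those steps together for the sales-lv volume created above, the sequence might look like the following. The /sales mount point is just an example name.

# mkfs.ext4 /dev/vg00/sales-lv
# mkdir /sales
# mount /dev/vg00/sales-lv /sales
# df -h /sales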


Scaling capacity

At this stage, we’ve seen the configuration of LVM, but we really have not yet been able to see the many benefits. One of the benefits of LVM configurations is the ability to scale storage capacity easily and quickly. Usually, of course, sysadmins need to scale up (increase capacity). It’s worthwhile to note that you can also scale storage capacity down with LVM. That means that if storage capacity is over-allocated (you configured far more storage than you needed to), you can shrink it. I will cover both scenarios in this section.

Let’s start with increasing capacity.

Increase capacity

You can add storage capacity to the Logical Volume. This is useful if the users consume more space than you anticipated. The process is pretty logical:

  1. Add a disk and configure it as a PV.
  2. Add it to a VG.
  3. Add the capacity to the LV and then extend the filesystem.

1. Install a storage disk and then configure it as a PV

To increase capacity, install a new disk and configure it as a PV, as per the steps above. If there is already a disk with free space available, you can certainly use that, as well.

Here is a reminder of the command to create a PV:

# pvcreate /dev/sdb2

In this case, I am designating partition 2 on disk /dev/sdb as the new PV.

2. Add space to the VG

Once the new capacity is designated for LVM, you can add it to the VG, increasing the pool’s size.

Run this command to add a new PV to an existing VG:

# vgextend vg00 /dev/sdb2

Now the storage pool is larger. The next step is to add the increased capacity to the specific Logical Volume. You can allocate any or all of the PV storage space you just added to the pool to the existing LV.

3. Add space to the LV

Next, add some or all of the new VG storage space to the LV that needs to be expanded.

Run the lvextend command to extend the LV to a given size:

# lvextend -L3T /dev/vg00/sales-lv

Run the lvextend command to add 1 GB of space to the existing size:

# lvextend -L+1G /dev/vg00/sales-lv

4. Extend the file system to make the storage capacity available

Finally, extend the file system. Both ext4 and XFS support this ability, so either filesystem is fine.

Unmount the filesystem by using the umount command:

# umount /newstorage

Here is the basic command for ext4:

# resize2fs /dev/vg00/sales-lv 3T
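
If the Logical Volume carries an XFS filesystem instead, you would grow it while it is mounted using xfs_growfs rather than resize2fs. This assumes the LV is mounted at /newstorage, as in the earlier example.

# xfs_growfs /newstorage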

Reduce capacity

Reducing storage space is a less common task, but it’s worth noting. The process occurs in the opposite order from expanding storage.

Note: XFS filesystems are not actually shrunk. Instead, back up the content and then restore it to a newly resized LV. You can use the xfsdump utility to accomplish this. The ext4 filesystem can be reduced; that’s the filesystem I focus on in this section. As we saw above with extending the filesystem, the volume must be unmounted. The exact command will vary depending on your LV name.

# umount /dev/vg00/sales-lv

1. Shrink the filesystem

Next, use the resize2fs command to reduce the filesystem size. It is recommended that you run fsck on ext4 filesystems before shrinking them. It is also recommended that you back up the data on the LV in case something unexpected occurs.

Here is an example of shrinking an ext4 filesystem:

# resize2fs /dev/vg00/sales-lv 3T

Note: You cannot shrink a filesystem to a size smaller than the amount of data stored on it.

2. Reduce the LV

Use the lvreduce command to shrink the storage space allocated to the LV. This returns the potential storage capacity to the VG.

# lvreduce -L 2T vg00/sales-lv

It is critical to realize that the above command sets the sales-lv at 2T. It does not remove two terabytes from the existing LV. It configures the LV at two terabytes. It is possible to tell lvreduce to subtract an amount of space from the existing capacity by using a very similar command:

# lvreduce -L -2T vg00/sales-lv

In this case, I added a - (dash) before the 2T size, indicating I want that quantity of space subtracted from the existing sales-lv capacity. The difference between the two commands is small but important.

You now have the returned capacity to the VG for use in another LV. You can use the extend commands discussed earlier to reallocate this capacity. The VG can also be shrunk.

Flexibility

Capacity can also be easily reallocated with LVM. You can reduce capacity in one VG and add it to another. This is accomplished by shrinking the filesystem and then removing the LV from the VG. Let’s say you have a server with 10 TB of capacity. Using the above processes, you have created two LVs of 5 TB each. After a few weeks, you discover that you should have created LVs of 7 TB and 3 TB instead. You can remove 2 TB of capacity from one of the Volume Groups and then add that capacity to the other VG. This is far more flexibility than traditional partitioning offers.

LVM also supports RAID configurations, mirroring, and other advanced settings that make it an even more attractive solution. Tyler Carrigan’s article Creating Logical Volumes in Linux with LVM has some good information on striping and mirroring Logical Volumes.


Wrap up

Logical Volume Manager is a great way to add flexibility to your storage needs, and it’s really not much more complicated than traditional drive management. There is great documentation at Red Hat’s site, and the official training courses cover it very effectively, as well.

One of the things that I appreciate about LVM is the logical command structure. The majority of the management commands are related and therefore relatively easy to remember. The point of the following table is not to summarize the commands, but rather for you to notice that the commands are all very similar and therefore user-friendly:

Command | Description
pvcreate | Create physical volume
pvdisplay | Display physical volume information
pvs | Display physical volume information
pvremove | Remove physical volume
vgcreate | Create volume group
vgdisplay | Display volume group information
vgs | Display volume group information
vgremove | Remove volume group
vgextend/vgreduce | Extend or reduce volume group
lvcreate | Create logical volume
lvdisplay | Display logical volume information
lvs | Display logical volume information
lvremove | Remove logical volume
lvextend/lvreduce | Extend or reduce logical volume

The next time you are standing up a local file server, consider using LVM instead of traditional storage management techniques. You may thank yourself months or years down the line as you need to adjust the server’s storage capacity.

from -> https://www.redhat.com/sysadmin/lvm-vs-partitioning


Understanding the Nginx Configuration File Structure and Configuration Contexts

Introduction

Nginx is a high performance web server that is responsible for handling the load of some of the largest sites on the internet. It is especially good at handling many concurrent connections and excels at serving static content.

While many users are aware of Nginx’s capabilities, new users are often confused by some of the conventions they find in Nginx configuration files. In this guide, we will focus on discussing the basic structure of an Nginx configuration file along with some guidelines on how to design your files.

Understanding Nginx Configuration Contexts

This guide will cover the basic structure found in the main Nginx configuration file. The location of this file will vary depending on how you installed the software on your machine. For many distributions, the file will be located at /etc/nginx/nginx.conf. If it does not exist there, it may also be at /usr/local/nginx/conf/nginx.conf or /usr/local/etc/nginx/nginx.conf.

One of the first things that you should notice when looking at the main configuration file is that it appears to be organized in a tree-like structure, defined by sets of brackets (that look like { and }). In Nginx parlance, the areas that these brackets define are called “contexts” because they contain configuration details that are separated according to their area of concern. Basically, these divisions provide an organizational structure along with some conditional logic to decide whether to apply the configurations within.

Because contexts can be layered within one another, Nginx provides a level of directive inheritance. As a general rule, if a directive is valid in multiple nested scopes, a declaration in a broader context will be passed on to any child contexts as default values. The children contexts can override these values at will. It is worth noting that an override to any array-type directives will replace the previous value, not append to it.

Directives can only be used in the contexts that they were designed for. Nginx will error out on reading a configuration file with directives that are declared in the wrong context. The Nginx documentation contains information about which contexts each directive is valid in, so it is a great reference if you are unsure.

Below, we’ll discuss the most common contexts that you’re likely to come across when working with Nginx.

The Core Contexts

The first group of contexts that we will discuss are the core contexts that Nginx utilizes in order to create a hierarchical tree and separate the concerns of discrete configuration blocks. These are the contexts that comprise the major structure of an Nginx configuration.

The Main Context

The most general context is the “main” or “global” context. It is the only context that is not contained within the typical context blocks that look like this:

# The main context is here, outside any other contexts

. . .

context {

    . . .

}

Any directive that exists entirely outside of these blocks is said to inhabit the “main” context. Keep in mind that if your Nginx configuration is set up in a modular fashion, some files will contain instructions that appear to exist outside of a bracketed context, but which will be included within such a context when the configuration is stitched together.

The main context represents the broadest environment for Nginx configuration. It is used to configure details that affect the entire application on a basic level. While the directives in this section affect the lower contexts, many of these aren’t inherited because they cannot be overridden in lower levels.

Some common details that are configured in the main context are the user and group to run the worker processes as, the number of workers, and the file to save the main process’s PID. You can even define things like worker CPU affinity and the “niceness” of worker processes. The default error file for the entire application can be set at this level (this can be overridden in more specific contexts).

The Events Context

The “events” context is contained within the “main” context. It is used to set global options that affect how Nginx handles connections at a general level. There can only be a single events context defined within the Nginx configuration.

This context will look like this in the configuration file, outside of any other bracketed contexts:

# main context

events {

    # events context
    . . .

}

Nginx uses an event-based connection processing model, so the directives defined within this context determine how worker processes should handle connections. Mainly, directives found here are used to either select the connection processing technique to use, or to modify the way these methods are implemented.

Usually, the connection processing method is automatically selected based on the most efficient choice that the platform has available. For Linux systems, the epoll method is usually the best choice.

Other items that can be configured are the number of connections each worker can handle, whether a worker will only take a single connection at a time or take all pending connections after being notified about a pending connection, and whether workers will take turns responding to events.

The HTTP Context

When configuring Nginx as a web server or reverse proxy, the “http” context will hold the majority of the configuration. This context will contain all of the directives and other contexts necessary to define how the program will handle HTTP or HTTPS connections.

The http context is a sibling of the events context, so they should be listed side-by-side, rather than nested. They both are children of the main context:

# main context

events {
    # events context

    . . .

}

http {
    # http context

    . . .

}

While lower contexts get more specific about how to handle requests, directives at this level control the defaults for every virtual server defined within. A large number of directives are configurable at this context and below, depending on how you would like the inheritance to function.

Some of the directives that you are likely to encounter control the default locations for access and error logs (access_log and error_log), configure asynchronous I/O for file operations (aio, sendfile, and directio), and configure the server’s statuses when errors occur (error_page). Other directives configure compression (gzip and gzip_disable), fine-tune the TCP keep alive settings (keepalive_disable, keepalive_requests, and keepalive_timeout), and the rules that Nginx will follow to try to optimize packets and system calls (sendfile, tcp_nodelay, and tcp_nopush). Additional directives configure an application-level document root and index files (root and index) and set up the various hash tables that are used to store different types of data (*_hash_bucket_size and *_hash_max_size for server_names, types, and variables).

The Server Context

The “server” context is declared within the “http” context. This is our first example of nested, bracketed contexts. It is also the first context that allows for multiple declarations.

The general format for server context may look something like this. Remember that these reside within the http context:

# main context

http {

    # http context

    server {

        # first server context

    }

    server {

        # second server context

    }

}

The reason for allowing multiple declarations of the server context is that each instance defines a specific virtual server to handle client requests. You can have as many server blocks as you need, each of which can handle a specific subset of connections.

Due to the possibility and likelihood of multiple server blocks, this context type is also the first that Nginx must use a selection algorithm to make decisions. Each client request will be handled according to the configuration defined in a single server context, so Nginx must decide which server context is most appropriate based on details of the request. The directives which decide if a server block will be used to answer a request are:

  • listen: The ip address / port combination that this server block is designed to respond to. If a request is made by a client that matches these values, this block will potentially be selected to handle the connection.
  • server_name: This directive is the other component used to select a server block for processing. If there are multiple server blocks with listen directives of the same specificity that can handle the request, Nginx will parse the “Host” header of the request and match it against this directive.
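
For example, a minimal pair of server blocks that are distinguished only by server_name might look like this (example.com and example.org are placeholder names):

server {
    listen 80;
    server_name example.com www.example.com;

    . . .

}

server {
    listen 80;
    server_name example.org;

    . . .

}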

The directives in this context can override many of the directives that may be defined in the http context, including logging, the document root, compression, etc. In addition to the directives that are taken from the http context, we also can configure files to try to respond to requests (try_files), issue redirects and rewrites (return and rewrite), and set arbitrary variables (set).

The Location Context

The next context that you will deal with regularly is the location context. Location contexts share many relational qualities with server contexts. For example, multiple location contexts can be defined, each location is used to handle a certain type of client request, and each location is selected by virtue of matching the location definition against the client request through a selection algorithm.

While the directives that determine whether to select a server block are defined within the server context itself, the component that decides on a location’s ability to handle a request is located in the location definition itself.

The general syntax looks like this:

location match_modifier location_match {

    . . .

}

Location blocks live within server contexts and, unlike server blocks, can be nested inside one another. This can be useful for creating a more general location context to catch a certain subset of traffic, and then further processing it based on more specific criteria with additional contexts inside:

# main context

server {

    # server context

    location /match/criteria {

        # first location context

    }

    location /other/criteria {

        # second location context

        location nested_match {

            # first nested location

        }

        location other_nested {

            # second nested location

        }

    }

}

While server contexts are selected based on the requested IP address/port combination and the host name in the “Host” header, location blocks further divide up the request handling within a server block by looking at the request URI. The request URI is the portion of the request that comes after the domain name or IP address/port combination.

So, if a client requests http://www.example.com/blog on port 80, the http, www.example.com, and port 80 would all be used to determine which server block to select. After a server is selected, the /blog portion (the request URI) would be evaluated against the defined locations to determine which further context should be used to respond to the request.

Many of the directives you are likely to see in a location context are also available at the parent levels. New directives at this level allow you to reach locations outside of the document root (alias), mark the location as only internally accessible (internal), and proxy to other servers or locations (using http, fastcgi, scgi, and uwsgi proxying).

Other Contexts

While the above examples represent the essential contexts that you will encounter with Nginx, other contexts exist as well. The contexts below were separated out either because they depend on more optional modules, they are used only in certain circumstances, or they are used for functionality that most people will not be using.

We will not be discussing each of the available contexts though. The following contexts will not be discussed in any depth:

  • split_clients: This context is configured to split the clients that the server receives into categories by labeling them with variables based on a percentage. These can then be used to do A/B testing by providing different content to different hosts.
  • perl / perl_set: These contexts configure Perl handlers for the location they appear in. This will only be used for processing with Perl.
  • map: This context is used to set the value of a variable depending on the value of another variable. It provides a mapping of one variable’s values to determine what the second variable should be set to.
  • geo: Like the above context, this context is used to specify a mapping. However, this mapping is specifically used to categorize client IP addresses. It sets the value of a variable depending on the connecting IP address.
  • types: This context is again used for mapping. This context is used to map MIME types to the file extensions that should be associated with them. This is usually provided with Nginx through a file that is sourced into the main nginx.conf config file.
  • charset_map: This is another example of a mapping context. This context is used to map a conversion table from one character set to another. In the context header, both sets are listed and in the body, the mapping takes place.

The contexts below are not as common as the ones we have discussed so far, but are still very useful to know about.

The Upstream Context

The upstream context is used to define and configure “upstream” servers. Basically, this context defines a named pool of servers that Nginx can then proxy requests to. This context will likely be used when you are configuring proxies of various types.

The upstream context should be placed within the http context, outside of any specific server contexts. The general form looks something like this:

# main context

http {

    # http context

    upstream upstream_name {

        # upstream context

        server proxy_server1;
        server proxy_server2;

        . . .

    }

    server {

        # server context

    }

}

The upstream context can then be referenced by name within server or location blocks to pass requests of a certain type to the pool of servers that have been defined. The upstream will then use an algorithm (round-robin by default) to determine which specific server to hand the request to. This context gives our Nginx the ability to do some load balancing when proxying requests.
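
For example, a location block can hand matching requests to the pool above with proxy_pass. The upstream_name here matches the upstream block defined earlier:

server {

    # server context

    location / {
        proxy_pass http://upstream_name;
    }

}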

The Mail Context

Although Nginx is most often used as a web or reverse proxy server, it can also function as a high performance mail proxy server. The context that is used for directives of this type is called, appropriately, “mail”. The mail context is defined within the “main” or “global” context (outside of the http context).

The main function of the mail context is to provide an area for configuring a mail proxying solution on the server. Nginx has the ability to redirect authentication requests to an external authentication server. It can then provide access to POP3 and IMAP mail servers for serving the actual mail data. The mail context can also be configured to connect to an SMTP Relayhost if desired.

In general, a mail context will look something like this:

# main context

events {

    # events context

}

mail {

    # mail context

}

The If Context

The “if” context can be established to provide conditional processing of directives defined within. Like an if statement in conventional programming, the if directive in Nginx will execute the instructions contained if a given test returns “true”.

The if context in Nginx is provided by the rewrite module and this is the primary intended use of this context. Since Nginx will test conditions of a request with many other purpose-made directives, if should not be used for most forms of conditional execution. This is such an important note that the Nginx community has created a page called if is evil.

The problem is basically that the Nginx processing order can very often lead to unexpected results that seem to subvert the meaning of an if block. The only directives that are considered reliably safe to use inside of these contexts are the return and rewrite directives (the ones this context was created for). Another thing to keep in mind when using an if context is that it renders a try_files directive in the same context useless.

Most often, an if will be used to determine whether a rewrite or return is needed. These will most often exist in location blocks, so the common form will look something like this:

# main context

http {

    # http context

    server {

        # server context

        location location_match {

            # location context

            if (test_condition) {

                # if context

            }

        }

    }

}

The Limit_except Context

The limit_except context is used to restrict the use of certain HTTP methods within a location context. For example, if only certain clients should have access to POST content, but everyone should have the ability to read content, you can use a limit_except block to define this requirement.

The above example would look something like this:

. . .

# server or location context

location /restricted-write {

    # location context

    limit_except GET HEAD {

        # limit_except context

        allow 192.168.1.1/24;
        deny all;
    }
}

This will apply the directives inside the context (meant to restrict access) when encountering any HTTP methods except those listed in the context header. The result of the above example is that any client can use the GET and HEAD verbs, but only clients coming from the 192.168.1.1/24 subnet are allowed to use other methods.

General Rules to Follow Regarding Contexts

Now that you have an idea of the common contexts that you are likely to encounter when exploring Nginx configurations, we can discuss some best practices to use when dealing with Nginx contexts.

Apply Directives in the Highest Context Available

Many directives are valid in more than one context. For instance, there are quite a few directives that can be placed in the http, server, or location context. This gives us flexibility in setting these directives.

However, as a general rule, it is usually best to declare directives in the highest context to which they are applicable, and to override them in lower contexts as necessary. This is possible because of the inheritance model that Nginx implements. There are many reasons to use this strategy.

First of all, declaring at a high level allows you to avoid unnecessary repetition between sibling contexts. For instance, in the example below, each of the locations is declaring the same document root:

http {
    server {
        location / {
            root /var/www/html;

            . . .

        }

        location /another {
            root /var/www/html;

            . . .

        }

    }
}

You could move the root out to the server block, or even to the http block, like this:

http {
    root /var/www/html;
    server {
        location / {

            . . .

        }

        location /another {

            . . .

        }
    }
}

Most of the time, the server level will be most appropriate, but declaring at a higher level has its advantages. This not only lets you set the directive in fewer places, it also cascades the default value down to all of the child elements, preventing situations where you run into an error because you forgot a directive at a lower level. This can be a major issue with long configurations. Declaring at higher levels provides you with a sane default.

Use Multiple Sibling Contexts Instead of If Logic for Processing

When you want to handle requests differently depending on some information that can be found in the client’s request, often users jump to the “if” context to try to conditionalize processing. There are a few issues with this that we touched on briefly earlier.

The first is that the “if” directive often returns results that do not align with the administrator’s expectations. Although the processing will always lead to the same result given the same input, the way that Nginx interprets the environment can be vastly different from what you might assume without heavy testing.

The second reason for this is that there are already optimized, purpose-made directives that are used for many of these purposes. Nginx already engages in a well-documented selection algorithm for things like selecting server blocks and location blocks. So if it is possible, it is best to try to move your different configurations into their own blocks so that this algorithm can handle the selection process logic.

For instance, instead of relying on rewrites to get a user supplied request into the format that you would like to work with, you should try to set up two blocks for the request: one that represents the desired format, and another that catches messy requests and redirects (and possibly rewrites) them to your correct block.

The result is usually easier to read and also has the added benefit of being more performant. Correct requests undergo no additional processing and, in many cases, incorrect requests can get by with a redirect rather than a rewrite, which should execute with lower overhead.
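
As a small sketch of this idea (the location names are placeholders, not taken from any particular configuration), the canonical form of a request gets its own block, and a sibling block catches the messy form and redirects to it, with no if logic involved:

# server context

# canonical block: serves the content directly
location = /app {

    # location context

    . . .

}

# catches the messy form and redirects it to the canonical block
location = /app/ {

    return 301 /app;

}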

Conclusion

By this point, you should have a good grasp on Nginx’s most common contexts and the directives that create the blocks that define them.

Always check Nginx’s documentation for information about which contexts a directive can be placed in and to evaluate the most effective location. Taking care when creating your configurations will not only increase maintainability, but will also often increase performance.


Debian Lenny – Installing Apache and PHP

Apache Install

A basic Apache install is very easy:

 
# sudo aptitude install apache2 apache2.2-common apache2-mpm-prefork apache2-utils libexpat1 ssl-cert

ServerName

Towards the end of the install you will see this warning:

 
apache2: Could not reliably determine the server's fully qualified domain name,
using 127.0.0.1 for ServerName

Although I’ll be going into some detail about the options and settings available in the main apache configuration file, let’s fix that warning straight away.

Open the main apache config:

 
# sudo nano /etc/apache2/apache2.conf

At the bottom of the file add the following:

 
ServerName demo

Change the ServerName to your Slice hostname or an FQDN (remember this demo Slice has a hostname of ‘demo’).

Once done, save apache2.conf and gracefully restart Apache (this method of restarting won’t kill open connections):

 
# sudo apache2ctl graceful

Now the warning has gone. Nice.
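
If you ever want to check the configuration syntax before a restart, Apache ships a config test (a quick sketch; on Debian the wrapper is apache2ctl):

# sudo apache2ctl configtest

A healthy configuration reports “Syntax OK”.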

Default Page

If you navigate to your Slice IP address:

 
http://123.45.67.890

You will see the default ‘It works!’ screen:

apache-itworks.jpg

Now that we have the base Apache install completed, we can move on to installing PHP. If you don’t require PHP then please feel free to skip the next section.

PHP5 Install

In this example, I’m not going to install all the modules available. Just some common ones.

To see what modules are available try a:

 
# sudo aptitude search php5-

Note the ‘-‘ at the end of ‘php5’. This will show any packages that start with ‘php5-‘ and shows the available modules.

As before, due to using aptitude to install PHP5, any dependencies are taken care of:

 
# sudo aptitude install libapache2-mod-php5 php5 php5-common php5-curl php5-dev php5-gd \
php5-imagick php5-mcrypt php5-memcache php5-mhash php5-mysql php5-pspell php5-snmp \
php5-sqlite php5-xmlrpc php5-xsl

Once done, do a quick Apache reload:

 
# sudo /etc/init.d/apache2 reload

PHP Test

Before we go any further, it is always a good idea to test your setup and make sure that everything we’ve already done is working as we expect.

We can test that Apache and PHP are playing nicely together very easily by creating a simple PHP file with a call to the phpinfo() function and then loading it in our web browser.

Let’s create the file:

 
# sudo nano -w /var/www/test.php

Now we will add some basic HTML and a call to the phpinfo() function to the file:

 
<html>
<head>
<title> PHP Test Script </title>
</head>
<body>
<?php
phpinfo( );
?>
</body>
</html>

Great, now we should be able to load that script in our web browser using your Slice IP address:

 
http://123.45.67.890/test.php

If everything is installed properly, you should see a PHP generated page that displays all sorts of information about your PHP installation:

php-test.jpg

Great! No need to worry about what all that means for now. We just wanted to verify that PHP was working.

Now that we know it is, let’s go ahead and remove that test script; we don’t need the whole world knowing about our PHP installation.

 
# sudo rm /var/www/test.php


Automatic DNS for Kubernetes Ingresses with ExternalDNS

ExternalDNS is a relatively new Kubernetes Incubator project that makes Ingresses and Services available via DNS. It currently supports AWS Route 53 and Google Cloud DNS. There are several similar tools available with varying features and capabilities, like route53-kubernetes, Mate, and the DNS controller from Kops. While it is not there yet, the goal is for ExternalDNS to include all of the functionality of the other options by 1.0.

In this post, we will use ExternalDNS to automatically create DNS records for Ingress resources on AWS.

Deploying the Ingress Controller

An Ingress provides inbound internet access to Kubernetes Services running in your cluster. The Ingress consists of a set of rules, based on host names and paths, that define how requests are routed to a backend Service. In addition to an Ingress resource, there needs to be an Ingress controller running to actually handle the requests. There are several Ingress controller implementations available: GCE, Traefik, HAProxy, Rancher, and even a shiny, brand new AWS ALB-based controller. In this example, we are going to use the Nginx Ingress controller on AWS.

Deploying the nginx-ingress controller requires creating several Kubernetes resources. First, we need to deploy a default backend server. If a request arrives that does not match any of the Ingress rules, it will be routed to the default backend, which will return a 404 response. The defaultbackend Deployment will be backed by a ClusterIP Service that listens on port 80.

The nginx-ingress controller itself requires three Kubernetes resources: the Deployment to run the controller, a ConfigMap to hold the controller’s configuration, and a backing Service. Since we are working with AWS, we will deploy a LoadBalancer Service. This will create an Elastic Load Balancer in front of the nginx-ingress controller. The architecture looks something like this:

     internet
        |
     [ ELB ]
        |
 [ nginx-ingress ]
   --|-----|--
   [ Services ]

We will deploy the nginx-ingress controller using the example manifests in the kubernetes/ingress repository.

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress/master/examples/aws/nginx/nginx-ingress-controller.yaml

At the time of this writing, this deploys a beta version (0.9.0-beta.5) of the nginx-ingress controller. The 0.9.x release of the ingress controller is necessary in order to work with ExternalDNS.

Now that we’ve deployed our Ingress controller, we can move on to our DNS configuration.

ExternalDNS currently requires full access to a single managed zone in Route 53 — it will delete any records that are not managed by ExternalDNS.

Warning: do not use an existing zone containing important DNS records with ExternalDNS. You will lose records.

If you already have a domain registered in Route 53 that you can dedicate to use for ExternalDNS, feel free to use that. In this post, I will instead show how you can create a subdomain in its own isolated Route 53 hosted zone. I am assuming for the purposes of this post that the parent domain is also hosted in Route 53. However, it is possible to use a subdomain even if the parent domain is not hosted in Route 53. In the following examples, I have a domain named ryaneschinger.com registered in Route 53 and I will be creating a new hosted zone for extdns.ryaneschinger.com dedicated to ExternalDNS.

Here is a small script we can use to configure the zone for our subdomain. Note that it depends on the indispensable jq utility.

export PARENT_ZONE=ryaneschinger.com
export ZONE=extdns.ryaneschinger.com

# create the hosted zone for the subdomain
aws route53 create-hosted-zone --name ${ZONE} --caller-reference "$ZONE-$(uuidgen)"

# capture the zone ID
export ZONE_ID=$(aws route53 list-hosted-zones | jq -r ".HostedZones[]|select(.Name == \"${ZONE}.\")|.Id")

# create a changeset template
cat >update-zone.template.json <<EOL
{
  "Comment": "Create a subdomain NS record in the parent domain",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "",
      "Type": "NS",
      "TTL": 300,
      "ResourceRecords": []
    }
  }]
}
EOL

# generate the changeset for the parent zone
cat update-zone.template.json \
 | jq ".Changes[].ResourceRecordSet.Name=\"${ZONE}.\"" \
 | jq ".Changes[].ResourceRecordSet.ResourceRecords=$(aws route53 get-hosted-zone --id ${ZONE_ID} | jq ".DelegationSet.NameServers|[{\"Value\": .[]}]")" > update-zone.json

# create a NS record for the subdomain in the parent zone
aws route53 change-resource-record-sets \
  --hosted-zone-id $(aws route53 list-hosted-zones | jq -r ".HostedZones[] | select(.Name==\"$PARENT_ZONE.\") | .Id" | sed 's/\/hostedzone\///') \
  --change-batch file://update-zone.json

We are using the AWS CLI to manage our zones in this post but you are probably better off using tools like Terraform or CloudFormation to manage your zones. You can also use the AWS management console if you must.

IAM Permissions

ExternalDNS will require the necessary IAM permissions to view and manage your hosted zone. There are a few ways you can grant these permissions depending on how you build and manage your Kubernetes installation on AWS. If you are using Kops, you can add additional IAM policies to your nodes. If you require finer grained control, take a look at kube2iam. This is the policy I am using for ExternalDNS on my cluster:

[
  {
    "Effect": "Allow",
    "Action": [
      "route53:ChangeResourceRecordSets",
      "route53:ListResourceRecordSets",
      "route53:GetHostedZone"
    ],
    "Resource": [
      "arn:aws:route53:::hostedzone/<hosted-zone-id>"
    ]
  },
  {
    "Effect": "Allow",
    "Action": [
      "route53:GetChange"
    ],
    "Resource": [
      "arn:aws:route53:::change/*"
    ]
  },
  {
    "Effect": "Allow",
    "Action": [
      "route53:ListHostedZones"
    ],
    "Resource": [
      "*"
    ]
  }
]

If you are following along, you will need to replace the <hosted-zone-id> in the first statement with the correct ID for your zone.

Deploy ExternalDNS

Here is an example Deployment manifest we can use to deploy ExternalDNS:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      containers:
      - name: external-dns
        image: registry.opensource.zalan.do/teapot/external-dns:v0.3.0-beta.0
        imagePullPolicy: Always
        args:
        - --domain-filter=$(DOMAIN_FILTER)
        - --source=service
        - --source=ingress
        - --provider=aws
        env:
        - name: DOMAIN_FILTER
          valueFrom:
            configMapKeyRef:
              name: external-dns
              key: domain-filter


A few things to note:

  • ExternalDNS is still in beta. We are using v0.3.0-beta.0 in this example.
  • We are running it with both the service and ingress sources turned on. ExternalDNS can create DNS records for both Services and Ingresses. In this post, we are just working with Ingress resources but ExternalDNS should work with Services as well with this configuration.
  • You must tell ExternalDNS which domain to use. This is done with the --domain-filter argument. The Deployment is configured to read this domain from a ConfigMap that we will create in the next step.
  • We tell ExternalDNS that we are using Route 53 with the --provider=aws argument.

Now we can deploy ExternalDNS. Make sure you change the value of domain-filter in the create configmap command below, and note that it is important that the domain ends with a “.”.

# create the configmap containing your domain
kubectl create configmap external-dns --from-literal=domain-filter=extdns.ryaneschinger.com.

# deploy ExternalDNS
kubectl apply -f https://gist.githubusercontent.com/ryane/620adbe00d3666119d3926910ac31046/raw/808ca3170ddf6549f39c487658eabe5b6faf9045/external-dns.yml

At this point, ExternalDNS should be up, running, and ready to create DNS records from Ingress resources. Let’s see this work with the same example used in the ExternalDNS documentation for GKE.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: nginx.extdns.ryaneschinger.com
    http:
      paths:
      - backend:
          serviceName: nginx
          servicePort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        ports:
        - containerPort: 80


You can use this manifest almost as-is but you do need to change the host rule in the Ingress resources to use your domain. This is what ExternalDNS will use to create the necessary DNS records. Download the file and update it with your domain name:

curl -SLO https://gist.githubusercontent.com/ryane/620adbe00d3666119d3926910ac31046/raw/c43d0e42f63948c50af672e2899858bc11ecaad3/demo.yml
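
One way to update the host rule in place, assuming GNU sed and a zone named extdns.example.com (substitute your own domain):

sed -i 's/nginx.extdns.ryaneschinger.com/nginx.extdns.example.com/' demo.yml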

After updating the host rule, we can deploy the demo application:

kubectl apply -f demo.yml

After a minute or two, you should see that ExternalDNS populates your zone with an ALIAS record that points to the ELB for the nginx-ingress controller you deployed earlier. You can check the logs to verify that things are working correctly or to troubleshoot if things are not:

$ kubectl logs -f $(kubectl get po -l app=external-dns -o name)
time="2017-05-04T11:20:39Z" level=info msg="config: &{Master: KubeConfig: Sources:[service ingress] Namespace: FqdnTemplate: Compatibility: Provider:aws GoogleProject: DomainFilter:extdns.ryaneschinger.com. Policy:sync Registry:txt TXTOwnerID:default TXTPrefix: Interval:1m0s Once:false DryRun:false LogFormat:text MetricsAddress::7979 Debug:false}"
time="2017-05-04T11:20:39Z" level=info msg="Connected to cluster at https://100.64.0.1:443"
time="2017-05-04T11:20:39Z" level=info msg="All records are already up to date"
time="2017-05-04T11:21:40Z" level=info msg="Changing records: CREATE {
  Action: "CREATE",
  ResourceRecordSet: {
    AliasTarget: {
      DNSName: "ad8780caf306711e7bea40a080212981-1467976998.us-east-1.elb.amazonaws.com",
      EvaluateTargetHealth: true,
      HostedZoneId: "Z35SXDOTRQ7X7K"
    },
    Name: "nginx.extdns.ryaneschinger.com",
    Type: "A"
  }
} ..."
time="2017-05-04T11:21:40Z" level=info msg="Changing records: CREATE {
  Action: "CREATE",
  ResourceRecordSet: {
    Name: "nginx.extdns.ryaneschinger.com",
    ResourceRecords: [{
        Value: "\"heritage=external-dns,external-dns/owner=default\""
      }],
    TTL: 300,
    Type: "TXT"
  }
} ..."
time="2017-05-04T11:21:40Z" level=info msg="Record in zone extdns.ryaneschinger.com. were successfully updated"
time="2017-05-04T11:22:40Z" level=info msg="All records are already up to date"
time="2017-05-04T11:23:40Z" level=info msg="All records are already up to date"
time="2017-05-04T11:24:40Z" level=info msg="All records are already up to date"

Assuming everything worked correctly, and allowing for propagation time, you should now be able to access the demo application through its dynamically created domain name:

$ curl nginx.extdns.ryaneschinger.com
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Very nice. In the next post, we will build upon this and generate TLS certificates for our Ingress resources with Let’s Encrypt.


Monitoring Spark on Hadoop with Prometheus and Grafana

Anyone who has spent time developing Spark applications (or any other distributed application for that matter) has probably wished for some x-ray goggles into the black-box machinery of the framework. While Spark provides a nice and increasingly feature-rich UI for checking on the status of running tasks and even gives statistics on things like runtime, memory usage, disk I/O etc., there are other aspects of the runtime that can remain an annoying mystery: how is the JVM memory being utilized? How much memory is the driver using? What about garbage collection? As it turns out, all these are reported by Spark’s metrics system; they’re out there, you just need to grab them.

TL;DR: Gaining insight into your Spark applications by collecting Spark metrics with tools like Prometheus is easy and can be done by anyone with or without admin privileges.

Unfortunately, the documentation regarding the metrics system is rather poor. If you also want to combine the Spark-reported metrics with those generated by Hadoop (YARN, HDFS), then you really embark on another Google-powered goose chase for insights, drawing on incomplete documentation pages and outdated blogs. I was inspired in this goose chase by an excellent blog post showing a nice use of Spark metrics (the only one I could find, actually) and set off to do this for my own system. (There is another nice post about using Prometheus to monitor Spark Streaming, but it uses the JMX exporter instead of Graphite.)

Goals

My main goals were two-fold:

  1. use metrics to better understand the JVM runtime of Spark applications
  2. combine spark, hadoop, and system-level metrics to complement performance benchmarks when making system architecture decisions

The first is somewhat obvious – tired of mysterious “Out of memory” exceptions, I want more fine-grained information about where, when, and why the problems arise. It is especially difficult to get any kind of information about off-heap memory usage and garbage collection by standard means in Spark and I want to rectify this situation.

The second is slightly more complex – we are running a 250+ node “test” Spark/Hadoop cluster on somewhat outdated hardware that is being used as a sandbox before we purchase a modern state-of-the-art machine. Benchmarks like Terasort on Hadoop or the spark-perf test suite give you timing information but not very much data on what the system is actually doing. What are the raw disk I/O rates on individual nodes? Is the network being saturated? Is HDFS performance hampered by slow disks, network, or CPU? When we run the same benchmark on a new system and get a (hopefully) improved time, which of these factors was most important and where could we perhaps downgrade components to save money without sacrificing performance? To answer these questions we really need instrumentation and monitoring.

Choices of monitoring backend and visualization

Graphite

The widely-adopted general-purpose monitoring choice seems to be Graphite. I found it pretty difficult to set up, owing to inconsistent documentation (for example, the top google hit for “graphite monitoring” takes you to outdated docs) and many components that need to play nice together. I spent a day configuring graphite/carbon and had a working system after some headache. When I needed to add Grafana on top of this, I quickly reached for a Vagrant VM setup that worked very well, but I didn’t want to rely on a Vagrant image when I actually tried to deploy this later.

In addition, the built-in Graphite UI is fairly basic at best. The plotting is rather cumbersome and outdated, though I’m sure it’s possible to set up nice dashboards with some effort. Still, it was very useful as an initial metrics browser, just to get a feeling for what is being reported.

Prometheus

A colleague pointed me to Prometheus which on the other hand took me about five seconds to get running. No database/apache configurations needed. Just download the appropriate release and go. Alternatively, you can run it easily via docker.

As an added bonus, I liked a few features of Prometheus that I hadn’t really thought about before trying Graphite:

The data model

The data model allows you to define metrics which are more like “metric containers” and give them fine-grained specifications using “labels”. In essence, the labels are the “dimensions” of each metric. For example, your metric might be “latency” and your labels would be “hostname” and “operating_system”. You can then easily look at aggregate statistics on “latency” or drill down seamlessly to get stats per host or per OS. Pretty nice.

The Query Language

This is intimately tied to the data model, but Prometheus comes with a pretty nice query language. Of course you have to learn a few things about the syntax, but once you do it’s pretty easy to use and has some nice features that allow you to take advantage of the multi-dimensionality of the metrics.

Scraping vs. pushing metrics

With Prometheus you have to define endpoints that it will “scrape” — it doesn’t get any data automatically and clients can’t push data to it. This is nice if you want some control over potentially noisy sources. You don’t have to alter the source, you can just stop scraping it for input temporarily.

Grafana

I haven’t experimented very much with the visualization front-end but went straight for Grafana. It was designed to be used with Graphite, but it is now possible to seamlessly insert Prometheus as a data source. Grafana looks good, has nice functionality, and seems fairly general so it seemed like a pretty safe choice.

Connecting Spark with Prometheus

Note: Before you continue here, make sure your Prometheus instance is running and you can reach it at http://localhost:9090 or whatever other port you configured.

Spark doesn’t have Prometheus as one of the pre-packaged sinks – so the strategy here is to ask Spark to export Graphite metrics and feed those into Prometheus via an exporter plugin. To report metrics to Graphite, you must set up metrics via a metrics.properties file. You can put this in $SPARK_HOME/conf or pass it to Spark on the command line by using --conf spark.metrics.conf=/path/to/metrics/file; beware that this path must exist on all executors. Alternatively, you can ship the file to the executors using the --files flag.
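
For example, a submit command that ships the properties file alongside the job might look like this (a sketch only; the master, path, and application are placeholders):

$SPARK_HOME/bin/spark-submit \
  --master yarn \
  --files /path/to/metrics.properties \
  --conf spark.metrics.conf=metrics.properties \
  your_app.py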

My metrics.properties looks like this:

*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=<metrics_hostname>
*.sink.graphite.port=<metrics_port>
*.sink.graphite.period=5
*.sink.graphite.unit=seconds

# Enable jvm source for instance master, worker, driver and executor
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource

worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource

driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource

executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

Spark’s monitoring sinks include Graphite, but not Prometheus. Luckily it’s really easy to get Graphite data into Prometheus using the Graphite Exporter, which you can easily get running either by building from source or using the Docker image. Once it’s up, all you need to do is change the port to which your Graphite clients (i.e. Spark in this case) are sending their metrics and you’re set; the default port is 9109, so make sure you set that in your metrics.properties file.
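
If you go the Docker route, something like the following should bring the exporter up (a sketch; the prom/graphite-exporter image and the 9108/9109 port mappings reflect the exporter’s defaults at the time of writing):

docker run -d -p 9108:9108 -p 9109:9109 -p 9109:9109/udp prom/graphite-exporter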

You can go to http://localhost:9108/metrics once the exporter is running to see which metrics it has collected – initially it will only have some internal metrics. To get spark metrics in there, make sure you set up the metrics.properties file and try running the spark pi example:

$ $SPARK_HOME/bin/spark-submit  --master local[*] $SPARK_HOME/examples/src/main/python/pi.py 500

On the http://localhost:9108/metrics you should now see a ton of lines like this:

# HELP local_driver_jvm_heap_init Graphite metric local.driver.jvm.heap.init
# TYPE local_driver_jvm_heap_init gauge
local_driver_jvm_heap_init 1.073741824e+09
# HELP local_driver_jvm_heap_max Graphite metric local-1450274266632.driver.jvm.heap.max
# TYPE local_driver_jvm_heap_max gauge
local_driver_jvm_heap_max 1.029177344e+09
# HELP local_driver_jvm_heap_usage Graphite metric local-1450274266632.driver.jvm.heap.usage
# TYPE local_driver_jvm_heap_usage gauge
local_driver_jvm_heap_usage 0.35
# HELP local_driver_jvm_heap_used Graphite metric local-1450274266632.driver.jvm.heap.used
# TYPE local_driver_jvm_heap_used gauge
local_driver_jvm_heap_used 3.60397752e+08

This is showing us that the Graphite exporter to Prometheus works, but by default all Graphite metrics are sent across just as 1D metrics to Prometheus, i.e. without any label dimensions. To get the data into the Prometheus data model, we have to set up a mapping.

Mapping Spark’s Graphite metrics to Prometheus

The one trick here is that if you just send raw Graphite metrics to Prometheus, you will not be able to use the nice Prometheus query language to its fullest because the metrics data will not have labels.

You can easily define mappings to turn these into proper Prometheus labeled metrics by specifying a mapping config file. Turning these JVM memory metrics into Prometheus metrics can be done with something like this:

*.*.jvm.*.*
name="jvm_memory_usage"
application="$1"
executor_id="$2"
mem_type="$3"
qty="$4"

This instructs the exporter to create a metric named jvm_memory_usage with labels application, executor_id, mem_type, and qty. After we restart the exporter with

host:~/graphite_exporter rok$ ./graphite_exporter -graphite.mapping-config graphite_exporter_mapping

and rerun the spark pi example, the metrics now look like this:

jvm_memory_usage{application="application_ID",executor_id="1",mem_type="non-heap",qty="committed"} 3.76832e+07

Great, now we can actually use Prometheus queries on our data!

Here is my full graphite exporter mappings file that will turn Spark Graphite metrics into something usable in Prometheus:

*.*.executor.filesystem.*.*
name="filesystem_usage"
application="$1"
executor_id="$2"
fs_type="$3"
qty="$4"

*.*.jvm.*.*
name="jvm_memory_usage"
application="$1"
executor_id="$2"
mem_type="$3"
qty="$4"

*.*.jvm.pools.*.*
name="jvm_memory_pools"
application="$1"
executor_id="$2"
mem_type="$3"
qty="$4"

*.*.executor.threadpool.*
name="executor_tasks"
application="$1"
executor_id="$2"
qty="$3"

*.*.BlockManager.*.*
name="block_manager"
application="$1"
executor_id="$2"
type="$3"
qty="$4"

DAGScheduler.*.*
name="DAG_scheduler"
type="$1"
qty="$2"

Exploring metrics in Prometheus

To actually see our Spark metrics in Prometheus, we need to tell it to scrape the graphite exporter for data. We do this by adding a job to prometheus.yml below the internal prometheus job declaration:

...

scrape_configs:

...

  - job_name: 'spark'

    target_groups:
      - targets: ['localhost:9108']

Now restart Prometheus (if it was running already) and it should start collecting metrics from the exporter. Rerun the spark pi example to get some metrics collected.

Prometheus comes with a simple web UI that should be accessible on http://localhost:9090. This allows you to try out some queries, for example you can enter this query:

jvm_memory_usage{executor_id='driver', qty='used', application="local-1450275288942"}

but replace the application identifier with your actual application ID and see the values reported back in the “Console” tab or the plot.

Basic Prometheus plot

This is nice to get a first look at your data, but for some sort of user-friendly metrics tracking, we’ll want to set up Grafana.

Using Grafana to Visualize Spark metrics via Prometheus

First, download Grafana or use their Docker container. I found that the build was breaking in ways I wasn’t able to debug very quickly so I resorted to using the Docker container for the purposes of testing.

Once Grafana is running, set up the Prometheus data source:

Prometheus data source

Now you are ready to set up some Grafana dashboards! When adding plots, just select the “Prometheus” data source in the bottom right and enter a query. Here’s an example:

Example of Prometheus plot in Grafana

In this example I’m using a template variable “application_ID” so that I can easily select the application I want. To define your own, go to the “templating” settings:

Templating in Grafana
Templating in Grafana

See the Grafana Prometheus documentation for more information.

Finally, a complete dashboard for a single Spark application showing some individual and aggregate Spark metrics may look like this:

Full Spark Grafana dashboard

You can play with the full snapshot here.

If you want to use this dashboard as a template, you can grab the JSON and import it in your own Grafana instance.

Reference: http://rokroskar.github.io/monitoring-spark-on-hadoop-with-prometheus-and-grafana.html


How to Get Started with Bacula Enterprise and Cloud Storage

Introduction

This article and the image have been created by a joint effort between Bacula Systems S.A, bytemine GmbH, and ProfitBricks GmbH.

We offer an image with a basic Bacula install that should cover everything you need to get started with Bacula Enterprise and cloud storage. We will show you how to use Bacula to build a proper and solid backup environment.

This guide is intended to show you around and give some basic examples on how to use Bacula and how to extend your setup. We mainly use Bweb Management Suite in this guide.

Create and start your Bacula server

Get your Bacula Server running in three steps:

  1. Create a Server in the Data Center Designer and attach two storage volumes. The second storage volume will be automatically integrated by the Bacula server and is used as a backup repository. Recommended size for the storage volumes:
    • Volume 1: 30 GB
    • Volume 2: 100 GB or bigger according to your backup needs.
  2. Assign the image bacula_server$ to the first hard disk.
  3. Provision the server. The password of the bweb interface can be found in the /root directory in the file new-bweb-password

Once your system is set up, point your browser to:

http://[your IP address]:9180/bweb

Bacula Enterprise overview

Bacula Enterprise has five main components:

  1. The Director: Supervises all the backup, restore, verify and archive operations.
  2. The Storage Daemon: Performs the storage of the file attributes and data to the physical backup media.
  3. The File Daemon or Client: Installed on each machine to be backed up.
  4. The Catalog: Maintains the file indexes for all files backed up.
  5. The Console: Allows users to interact with the Director. The console is available in two versions:
    • Bconsole: Text-based console interface.
    • BWeb Management Suite: Web interface.
Bacula Enterprise architecture

It is important to understand some of the basic Bacula architecture and terminology before setting up and using your backup environment, as it will ease your task and will avoid future mistakes.

All of the terms used are explained in the concept guide available here.

If you want to dig deeper check out the main manual available here.

Managing your Bacula setup

There are multiple ways to manage your Bacula setup.

For some administrators the bconsole, a text-based console interface, is the way to go.

However in this guide, we will mostly cover the Bweb Management Suite, as it is a good way to start if you are new to Bacula.

Bweb Management Suite is a web interface that interacts with the Director and offers tools to run backup and restore jobs, to monitor and to configure your Bacula infrastructure.

Bweb also offers a lot of help features. If you ever get lost, click on one of the help buttons.

Worksets

Bacula uses worksets to track the configuration changes you make. Any time you make a change, it needs to be committed in order to go live.

If you discover that you made a mistake during your configuration, you can clear your workset and start over.

The workset is available at Configuration –> Configure Bacula.

Bacula workset

Backup a Linux client with Bweb Management Suite

We will begin by configuring and creating your first backup.

Go to Configuration –> Configure Bacula –> Clients.

As you can see, there is already a Bacula client, bacula-fd, which has been configured.

There are also some default jobs defined.

To start a job for our bacula-fd, in the main menu go to Jobs –> Defined Jobs. Choose the job you want to start from the drop-down menu.

Defined jobs

Let’s choose the /usr directory. Click on the Run now button.

Another drop down menu will appear. Choose the default values and click Run now again.

You will be taken to the running job information window. You can click Refresh to watch the progress.

Running job information window

Once the job is complete, an automatic refresh will take you to the job report window, where you can see the outcome of the backup job.

You can also find this information by going to Jobs –> Jobs History and clicking on the status icon.

Adding a new backup client

In our next example we’ll add a new client and configure everything needed for a valid backup. The first step is to install the client software and make it accessible.

Installing a file daemon

Bacula file daemons, called bacula-fd, exist for almost all platforms.

We provide a CD image, available in the image section of your datacenter, which can be mounted onto your server.

Add a CD-ROM drive, available in the storage section of your server, and assign the “bacula-agent-client.iso” image to it. You need to provision your changes for this to become available.

On Windows systems the CD-ROM drive will be detected and mounted automatically. You can now browse the image, choose the required version, execute the installer package, and follow the instructions.

On Linux systems the following steps have to be taken:

Mounting the Image

mkdir /mnt/bacula-clients 
mount -o ro /dev/cdrom /mnt/bacula-clients/

debian based systems:

Example for an Ubuntu14.04 system:

dpkg -i /mnt/bacula-clients/8.2.7/trusty-64/bacula-enterprise-common_8.2.7-1_amd64.deb
dpkg -i /mnt/bacula-clients/8.2.7/trusty-64/bacula-enterprise-client_8.2.7-1_amd64.deb
apt-get install -f

redhat based systems:

Example for a Centos7 system:

rpm -i /mnt/client-installer/8.2.7/rhel7-64/bacula-enterprise-libs-8.2.7-1.el7.x86_64.rpm
rpm -i /mnt/client-installer/8.2.7/rhel7-64/bacula-enterprise-client-8.2.7-1.el7.x86_64.rpm

suse based systems:

Example for a SLES12 system:

rpm -i /mnt/client-installer/8.2.7/sles12-64/bacula-enterprise-libs-8.2.7-1.su120.x86_64.rpm
rpm -i /mnt/client-installer/8.2.7/sles12-64/bacula-enterprise-client-8.2.7-1.su120.x86_64.rpm

DNS

Be aware that your clients have to be able to resolve the bacula-director via the hostname “bacula”. This can be achieved by adding a DNS ‘A’ record pointing to the IP address of the Director to your own domain name server, or by adding an /etc/hosts entry on your clients.
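
For example, a hosts entry on a client could look like this (the IP address is a placeholder for your Director’s address):

# /etc/hosts on the client
203.0.113.10    bacula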

When adding the client in Bweb you have to specify an address where the client is reachable within your network. You can specify an IP address or an FQDN like bacula-client.example.org.

When using an FQDN, make sure it is resolvable from your bacula-director.

Firewalling

The bacula-director has to be able to reach the bacula-fd at TCP port 9102. On Windows machines this will be blocked by default if the Windows firewall is active.

Your bacula-fd also has to be able to reach the Bacula storage daemon (bacula-sd) at port 9103. Keep this in mind when putting your system into production.

Overall needed firewall rules:

bacula-dir -> bacula-sd:9103
bacula-dir -> bacula-fd:9102
bacula-fd  -> bacula-sd:9103
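
On a Linux client that uses iptables, a rule allowing the Director to reach the file daemon might look like this (a sketch; replace the source address with your Director’s IP):

iptables -A INPUT -p tcp -s 203.0.113.10 --dport 9102 -j ACCEPT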

Configuring the file daemon

Go to Configuration –> Configure Bacula.

On the left hand side you will find wizards to help you in adding a variety of resources.

Click on Add a new Client resource. The wizard will guide you through the next steps.

Configure a new client resource pop-up 1
Configure a new client resource pop-up 2

If you want to use TLS Encryption for your backups, choose the appropriate certificate(s).

The Bweb Management Suite also offers a wizard for the creation of these certificates at Configuration –> Configure Bacula –> Security Center.

However in our example we will leave this one out.

Configure a new client resource pop-up 3

The wizard will generate the configuration for your client (file daemon).

There are two ways to get this configuration onto your client:

  1. Copy and paste the output into the bacula-fd.conf of the machine that is going to be backed up.
  2. Use Bweb Management Suite’s “push configuration” option. Click on edit and deploy this newly created FileDaemon Resource.

Note: Make sure the client is accessible via password or public-key authentication if you plan on using this.

Configure a new client resource pop-up 4
Configure a new client resource pop-up 5

Once this is done go to Workset and commit your changes.

Fileset customization

To show some more of the flexibility Bacula offers we will apply a new fileset for your newly created file daemon.

One potential issue is the availability of backup space. For a Linux client it is possible to back up everything starting at the root directory /.

But do we need all that data?

Many files will be the same, for example if you use the same Linux distribution on several servers.

In most cases this doesn’t matter as we use the deduplication plugin. But if you have a specific server where you need to back up a large amount of data (e.g. everything in /opt), you should start by adding a new fileset.

In the base setup we have defined a job for /usr.

Now we will do the same for /opt.

Using the Bweb Management Console, navigate to Filesets and click on Add a new fileset resource:

Enter a name for the fileset and an optional description.

Create a new fileset resource pop-up 1

On the next screen you will be prompted with a wide variety of options, in our example we will choose:

Yes – backup all Client content followed by No – Select paths and files

Create a new fileset resource pop-up 2

Adjust the config snippet so it looks like the one in the screenshot below and click on Save.

Create a new fileset resource pop-up 2
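
Since the screenshot is not reproduced here, a FileSet resource for /opt typically looks something like the following sketch (the name and option values are assumptions, not the exact output of the wizard):

FileSet {
  Name = "opt-fileset"
  Include {
    Options {
      Signature = MD5
      Compression = GZIP
    }
    File = /opt
  }
}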

Creating a database backup

MySQL

Backing up a database is a little more complicated than backing up a filesystem.

Bacula Enterprise has many plugins which can be used. In this example we will take a closer look at the MySQL and the MSSQL plugins.

Begin by creating a new fileset. Choose the MySQL plugin in the list down below.

Fill out the needed information and create the needed permissions and users on your backup client.

MySQL plugin pop-up 1
MySQL plugin pop-up 1

Next, create the job.

MSSQL

For Windows systems, Bacula Enterprise uses Volume Shadow Copy Service (VSS) to ensure file system consistency.

Begin by creating a new fileset. Choose the Microsoft VSS MSSQL plugin in the list below.

Fill out the needed information and create the needed permissions and users on your backup client.

MSSQL plugin pop-up 1

More info on database backups and restoring them can be found here.

Create the job

Next, create and configure a new job resource to use the new fileset.

Go to Configuration –> Configure Bacula and on the left hand side choose Add a new Backup.

Create new job pop-up 1
Create new job pop-up 2
Create new job pop-up 3
Create new job pop-up 4
Create new job pop-up 5

After that you will be prompted with more configuration options for the newly created job. You can still edit some of the available options.

If everything fits your needs, click Save.

Edit job opt

Restore files on a Linux Client using Bweb Management Suite

From the main menu go to Jobs –> Web Restore.

Select the client in the Select a client… list and job in the Select a job… list.

The directory tree will automatically show up on the directories panel.

Restore files 1

In the directories panel, browse to /usr/bin/.

The content will show up in the panel directory content.

Highlight some or all of the files in /usr/bin, and drag them to the restore selection panel located below.

Restore files 2

Click Run Restore.

Restore files 2

Restore Selection pop-up will appear.

Note: By default, your recovered files will be sent to /tmp/bacula-restores/ using the value in the Where field.

You can change this to a different path or client if you already have multiple clients configured.

Click on Run to start the job. You will be redirected to a status window. Click the Refresh button to see a green status bar that will show the job progress.

Restore files 3

When the restore is successful, you will be redirected to a job status window.

To check for the restored files, SSH to the bacula-client, change to the path you selected in the restore process, and list the contents.
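
For example, with the default Where value, the restored tree ends up under /tmp/bacula-restores (the client hostname is a placeholder):

ssh bacula-client
cd /tmp/bacula-restores/usr/bin
ls -l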

Example: Backing up a directory with BConsole

This will show you how to use BConsole, a text based console interface used to interact with the Director, to run backup and restore jobs and to monitor your Bacula infrastructure.

To access BConsole, log on your server directly or via ssh.

This scenario uses the folder /home/bacula located on your server.

First, create a 200 MB file with random data which we will use as an example:

dd if=/dev/urandom of=/home/bacula/random.log bs=100M count=2

Next, enter bconsole:

/opt/bacula/bin/bconsole

In the console, type run and hit Enter.

Select the job labeled home by typing the number in front of it.

Review the job settings. Note that this job is an incremental one. You may modify parameters at run time, but in this case just type yes to start the job.

Backup with bconsole 1

You should be returned to the prompt and given an indication that the job has been queued, like Job queued. JobId=x.

To view the status of the job, type status dir.

You might see the job listed under Running Jobs or under Terminated Jobs if it is already finished. You can also see scheduled jobs and jobs which were completed in the past.

You can review the job log at any time by using the messages command.

You should now have a full backup of your /home/bacula folder. The next step is to restore this data to the default restore location, currently the folder /tmp/bacula-restores/.

Example: Restoring files with BConsole

In this scenario we are restoring data from an existing backup folder from /home/bacula which we backed up in the previous example.

Simply enter restore.

Here you will see the different ways to find what you want to restore. Type 5, which will select the last backup job run. Note that your backup client is automatically selected since there is only one client available.

Here you will see all data sets that have been run. Select the fileset labeled home by typing the number before home to select that folder.

When you hit Enter, Bacula Enterprise Edition will build a file tree from the catalog. This will let you browse through and choose the files you want to restore.

Many standard commands are available such as ls to list the directory, and cd to change directories.

To restore /home/bacula begin with:

cd /home/bacula

Type the command:

mark *

That will mark the contents of the directory for restore.

Restore from bconsole 1

When you are done selecting files and folders, type done.

Review the job settings, most importantly the Where and Restore Client directives. We can leave these alone for now, but if you ever want to restore to a different location or another client machine, here is one way to accomplish it.

Type yes to run the restore job.

To view the status of the job, type status dir.

Type messages in BConsole to review the job log. Run the messages command periodically to check the status.

When the job is complete, you will see Restore OK.

Type exit to close BConsole.


Ubuntu Install Java 8

Step 1: Install Java 8 (JDK 8)

Open a terminal and enter the following commands:

sudo add-apt-repository -y ppa:webupd8team/java
sudo apt-get update
echo debconf shared/accepted-oracle-license-v1-1 select true | sudo debconf-set-selections
echo debconf shared/accepted-oracle-license-v1-1 seen true | sudo debconf-set-selections
sudo apt-get -y install oracle-java8-installer

Set up the Java environment:

sudo apt-get -y install oracle-java8-set-default

Step 2: Verify the Java Version

After successfully installing Oracle Java with the steps above, verify the installed version using the following command:

$ java -version

You should see output similar to this:

java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
