Caddy and Tailscale Integration

I recently decided to transfer data from a SQLite database on my laptop to the PostgreSQL database on my server, as part of some data exploration I was doing on screen time. My previous post briefly covers the reasoning behind this transfer.

Quick Background

Before diving into the integration of Caddy and Tailscale, let's quickly understand what each component does. Caddy is a web server with a simple configuration syntax that prioritizes ease of use and security through automatic HTTPS. Tailscale is a VPN solution that allows you to establish secure connections between different devices and networks easily.

SQLite and SSH

A common pattern in data engineering is to pull a file from a remote server over SFTP and then process it locally. This approach works when the data lives on a server that stays on the same network and enabling SSH on it is not an issue.

However, the data I need lives on a laptop that cannot have its port 22 open, for security reasons, when I join networks I don't fully trust. I also did not want to manage more Python dependencies on my laptop, since I already have conflicting package versions and could very easily break one.

Tailscale SSH

After some time, I found a solution that fit my requirements: Tailscale SSH. It lets me securely access my laptop's data from anywhere without opening ports or worrying about the network I am on. Tailscale introduced this feature in the past year, and you can learn more about it here.

I did run into one issue related to the App Store installation of Tailscale. I had to uninstall the App Store version on my Mac first, following the steps outlined here.

After following the uninstallation steps, I installed the standalone version of Tailscale, which can be found on GitHub.

Now Tailscale intercepts SSH connections to my laptop, verifies that the connecting device is logged into Tailscale, and connects accordingly. Tailscale handles the SSH keys behind the scenes. Finer-grained controls can also be set up, such as limiting SSH access to specific machines!

Awesome! I wrote a pipeline to pull the data from my Mac to PostgreSQL using Dagster.

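import os
from contextlib import closing
# assumption: connect_sqlite is an alias for sqlite3.connect
from sqlite3 import connect as connect_sqlite

import pandas as pd
from dagster import ConfigurableResource, get_dagster_logger


class RemoteSQLiteResource(ConfigurableResource):
    remote_path: str
    local_path: str = "./temp.db"
    # when adding new env, add to dagster.yaml too
    # when using Tailscale SSH, give the tailscaled daemon full disk access
    host: str = os.getenv("MAC_HOST")

    def get_remote_file(self):
        # imported lazily so fabric/paramiko are only needed when this runs
        from fabric import Connection
        from paramiko import RSAKey

        get_dagster_logger().info(f"Getting remote file from {self.host}")

        # paramiko needs a password or pkey, but Tailscale SSH does the real
        # authentication, so a throwaway random key is enough
        with Connection(self.host, connect_kwargs={"pkey": RSAKey.generate(2048)}) as c:
            c.get(self.remote_path, self.local_path)

    def get_client(self):
        # copy the database over first, then open the local copy
        self.get_remote_file()
        return connect_sqlite(self.local_path)

    def execute_query(self, sql: str):
        # closing() makes sure the SQLite connection is closed after the read
        with closing(self.get_client()) as con:
            return pd.read_sql_query(sql, con=con)


# PostgresResource (not shown) is defined elsewhere in the project
def st2Postgres(context, postgres: PostgresResource, screentime: RemoteSQLiteResource):
    table_name = "zobject"
    # only fetch rows newer than the last primary key already in Postgres
    last_row = postgres.get_latest_row(table_name)
    df = screentime.execute_query(f"SELECT * FROM {table_name} WHERE Z_PK > {last_row}")
    context.log.info(f"Read {df.shape[0]} rows from raw table")
    last_processed = postgres.insert_df_update(df, table_name, pk="Z_PK")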
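
The pipeline above only pulls rows whose Z_PK is greater than the latest key already stored in Postgres, so re-running it does not copy the same rows twice. Here is a minimal sketch of that high-water-mark idea using only the standard library. It assumes an in-memory SQLite database standing in for Postgres, a made-up `payload` column, and hypothetical helpers `get_latest_row`, `copy_new_rows`, and `make_db`; none of these are the real PostgresResource methods.

```python
import sqlite3


def get_latest_row(dest: sqlite3.Connection, table: str, pk: str = "Z_PK") -> int:
    """Return the largest key already loaded, or 0 if the table is empty."""
    (latest,) = dest.execute(f"SELECT COALESCE(MAX({pk}), 0) FROM {table}").fetchone()
    return latest


def copy_new_rows(src: sqlite3.Connection, dest: sqlite3.Connection,
                  table: str, pk: str = "Z_PK") -> int:
    """Copy only the rows whose key is above the destination's high-water mark."""
    last_row = get_latest_row(dest, table, pk)
    rows = src.execute(
        f"SELECT {pk}, payload FROM {table} WHERE {pk} > ? ORDER BY {pk}",
        (last_row,),
    ).fetchall()
    dest.executemany(f"INSERT INTO {table} ({pk}, payload) VALUES (?, ?)", rows)
    dest.commit()
    return len(rows)


def make_db(rows):
    """Build an in-memory database with a toy zobject table (hypothetical schema)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE zobject (Z_PK INTEGER PRIMARY KEY, payload TEXT)")
    con.executemany("INSERT INTO zobject VALUES (?, ?)", rows)
    con.commit()
    return con


source = make_db([(1, "a"), (2, "b"), (3, "c")])  # plays the role of the laptop
destination = make_db([(1, "a")])                 # plays the role of Postgres

first_run = copy_new_rows(source, destination, "zobject")
second_run = copy_new_rows(source, destination, "zobject")
print(first_run, second_run)
```

The second run finds nothing new because the destination's maximum key already matches the source, which is what makes the job safe to schedule repeatedly.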

Caddy and Tailscale

The majority of this post was on Tailscale, but I wanted to plug Caddy since I have found it to be an amazing piece of software!

With the two connected, certain domains are reachable only while connected to the Tailscale network, and I don't have to remember IP and port combinations. Tailscale also lets you share your network with other users under fine-grained access control rules.

Ex: https://notedwin.cow-ghost.ts.net is not publicly accessible, but it is accessible through Tailscale. :p
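
One rough way to sanity check that a name like this only points into the tailnet is to look at the address it resolves to. Tailscale hands out device addresses from the 100.64.0.0/10 range for IPv4 and fd7a:115c:a1e0::/48 for IPv6. The sketch below only classifies an address string; the helper name `is_tailscale_address` and the sample addresses are made up for illustration, and resolving the hostname is left out.

```python
import ipaddress

# Ranges Tailscale assigns device addresses from
TAILSCALE_NETWORKS = (
    ipaddress.ip_network("100.64.0.0/10"),        # IPv4, carrier-grade NAT range
    ipaddress.ip_network("fd7a:115c:a1e0::/48"),  # IPv6, unique local prefix
)


def is_tailscale_address(address: str) -> bool:
    """Return True if the address falls inside one of the Tailscale ranges."""
    ip = ipaddress.ip_address(address)
    # membership is simply False when the IP version does not match the network
    return any(ip in network for network in TAILSCALE_NETWORKS)


print(is_tailscale_address("100.101.102.103"))  # hypothetical tailnet address
print(is_tailscale_address("192.168.0.108"))    # ordinary LAN address
```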

Below is some code from my config file:

notedwin.cow-ghost.ts.net {
	...
	handle_path /cam/* {
		reverse_proxy /* 192.168.0.108:8765
	}
	...
}

*.notedwin.com {
	...
	encode zstd gzip

	@grafana host grafana.notedwin.com
	handle @grafana {
		reverse_proxy grafana:3000
	}

	@map host map.notedwin.com
	handle @map {
		redir https://grafana.notedwin.com/public-dashboards/f58a8583eb974ecebb49c127b41ff679?orgId=1 permanent
	}

	@resume host resume.notedwin.com
	...
	handle {
		redir https://notedwin.com
	}
}
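
To make the routing in the wildcard block easier to follow, here is a small Python model of it. This is only an illustration of how I read the config, not how Caddy is implemented: handle blocks at the same level are mutually exclusive, a request goes to the one whose host matcher fits, and the handle without a matcher acts as the fallback. The `route` function is hypothetical, and the matchers elided with `...` in the config are left out.

```python
def route(host: str) -> tuple[str, str]:
    """Return (action, target) for a request host, mirroring the handle blocks."""
    if host == "grafana.notedwin.com":
        # @grafana: proxy to the Grafana container
        return ("reverse_proxy", "grafana:3000")
    if host == "map.notedwin.com":
        # @map: `permanent` makes this a 301 redirect
        return (
            "redirect 301",
            "https://grafana.notedwin.com/public-dashboards/"
            "f58a8583eb974ecebb49c127b41ff679?orgId=1",
        )
    # bare `handle`: redir without a code defaults to a temporary 302
    return ("redirect 302", "https://notedwin.com")


for name in ("grafana.notedwin.com", "map.notedwin.com", "anything.notedwin.com"):
    print(name, "->", route(name))
```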

More information about enabling HTTPS with Tailscale can be found here, and you can read about Caddy-Tailscale integration on the Tailscale blog.

Conclusion

I was able to complete my personal health dashboard thanks to Tailscale! Here is one panel from my Grafana dashboard:

[Image: Grafana Dashboard]