Skip to content

Metabase

Created: 2017-07-07 13:51:56 -0700 Modified: 2019-08-16 13:56:20 -0700

Metabase is essentially a dashboard layer on top of database queries. For example, if you want to see how many accounts were created in the last week as a graph, you can do that using Metabase.

They suggest learning these three basic things:

All of your data is organized in Tables. Think of them in terms of Excel spreadsheets with columns and rows.

To get answers, you Ask Questions by picking a table and a few other parameters. You can visualize the answer in many ways, including cool charts.

You (and anyone on your team) can save answers in Dashboards, so you can check them often. It’s a great way to quickly see a snapshot of your business.

To get started learning all of the features of Metabase, they have a handy Users Guide.

Main site

GitHub

The only confusion I had was how to change the default web port. Apparently it’s like this (reference):

I can just use “run.cmd” for Bot Land

cd /d D:CodeBotLandbotland-metabase
set MB_ENCRYPTION_SECRET_KEY=<pull from password manager>
run

From a clean setup though, you have to do this:

set MB_JETTY_PORT=4242
java -jar metabase.jar

Setting it up for the first time should be done off-stream (if you’re streaming).

It sounds like you shouldn’t use raw SQL questions unless you can’t accomplish your goal in any other way. This should end up being very rare. By using raw queries, you lose the ability to use segments, metrics, chart options (reference).

As a developer, I like showing the column names exactly the same as they are in the database so that Metabase doesn’t do weird things like turn “atk_rating” into “A Tk Rating”:

Settings → Admin → General → Friendly table and field names → Disabled

Note that you can also change the display name on a per-table and per-column basis by updating the data model (reference).

Bitwise operators don’t seem to exist for filters. You can hack your way around this by picking multiple “Equal to” values:

The database that Metabase uses is H2 by default, not MySQL.

I think it’s a good idea to split up data into many categories, e.g. “Users”, “Inventory”, etc.

Make use of these buttons! They made them work well and there isn’t always an obvious UI button that’ll let you go back.

Suppose you want to represent this query in Metabase:

select users.id, count(1) from users join matches on users.id = matches.attacker_id group by users.id;

You can set that up this way:

Metabase doesn’t encrypt your connection information inside of ITS database by default, but you can enable it via the reference link.

set MB_ENCRYPTION_SECRET_KEY=some really secure password

^— Metabase enforces that this is at least 16 characters long

Start Metabase

New databases will be encrypted with the key and existing databases just have to be saved again

After that, your credentials will be encrypted.

An easy way to X-ray a table is: home → scroll to bottom → click database → click on the ⚡ icon.

If you find yourself wanting queries to be over many different time ranges, e.g. last week, last month, last year, then you can filter on the entire dashboard.

Let’s say you want to have a dashboard that shows number of users who logged in for the last X time:

  • Go to your dashboard
  • Add a question that looks like you’re just getting the total amount of users with no filter whatsoever (because the dashboard will add the filter):

  • Edit the dashboard with the ✏ icon and click the filter icon (I chose “relative date”)
  • You’ll see this UI:

  • The top part lets you pick a date for the whole dashboard. The bottom part shows a “Select…” dropdown that lets you filter for the individual query. So in this case, since I want the number of users who logged in for the last X days, I would pick “last login date” from that dropdown:

  • Click “Done” at the top, but even once you do that, your question won’t be updated until you save the dashboard too.

You can even do this with date ranges if you want, so you could get logins between 3 and 2 months ago.

You can also have different filters affect different parameters per question, e.g. “show the number of users who logged in during the last week who have made a purchase between May 5th and May 17th”.

A metric is something that you want to track, e.g. “total number of sign-ups”, then you can later drill down to a particular timespan.

A segment is a special named filter or set of filters.

A metric is a way to refer to a computed number that you reference often. This is used for the “view” in a custom-built query.

A segment may be something like “user who was active in the last 30 days”, which you may have tons of queries that you want to base off of that.

A metric may be something like “sum of money spent by each user”. You could bake in a filter to that so that the metric becomes “sum of money spent by users who logged in during the last week”, but if you think you’ll change the filter frequently, then I guess you should leave that up to the individual questions.

If you need to use a custom field, you can do so by clicking the three horizontal dots to the right of the query builder:

  • Visualizations have options, e.g. the ability to format a single number like “$292.1K” instead of “292100”.
  • Smart numbers are used for comparing how a number has changed over a given set of time periods
    • Note: this currently sucks for things like account creation due to this issue. Summary: comparing a sum over the last week uses the calendar week, meaning on Monday, you’ll only have one day of data, but you’ll still compare it to the last entire week, so essentially your stats will always look like they’ve dropped drastically unless you’re comparing on Saturday night.

You can overlay multiple series on top of each other (reference):

This is done by taking at least two questions with the same X axis, editing the dashboard, then clicking the ”📈 Edit” button.

Metabase provides an official Docker image:

docker run -d -p 3000:3000 --name metabase metabase/metabase

Keep in mind that if you want to connect to a database running on the host machine, you should use the —add-host directive (reference):

docker run -d -p 7678:3000 --add-host=dockerhost:192.168.1.12 --name metabase metabase/metabase

Then, when Metabase asks for the host, specify “dockerhost”. After that, I actually got another error saying “java.sql.SQLException: Access denied for user ‘root’@‘my-computer-name’ (using password: YES)“. So I just went into MySQL on my host and ran this command:

GRANT ALL PRIVILEGES ON botland.* TO ‘root’@‘my-computer-name’ IDENTIFIED BY ‘password’;

Alternatively, you could mount your local MySQL database as a volume to Docker with something like “-v /var/lib/mysql:/var/lib/mysql”.

Metabase produces its own local database with your saved queries, and you’ll want to persist this between running Metabase in Docker. Their operations guide has some details on how to do this:

  • Map a volume from the host filesystem to the container’s
  • “docker cp” to get the file out of the container manually
  • “docker commit” to get a new custom Docker image from the stopped container
  • You can point Metabase at your existing external Postgres database

Then, once the Metabase container is done, you can commit its database to Git.

ECS itself is free when backed by EC2 (although EC2 costs money), so there’s no real reason not to use ECS over vanilla EC2. Unfortunately, the process is not easy without the right instructions, and finding those was tough. They’re largely here, but I’ve detailed what I went through below.

  • IAM
    • Make a new role
    • Choose “Elastic Container Service” from the list at the bottom
    • Choose “EC2 Role for ECS”
    • Ensure that it has AmazonEC2ContainerServiceforEC2Role
  • ECS
    • Make a new cluster - EC2 Linux + Networking
      • I left the cluster empty
    • Make a new task definition
      • Task role: none
      • Here’s the JSON I ended up with for reference (note that ARNs will have to change):
{
"ipcMode": null,
"executionRoleArn": "arn:aws:iam::212785478310:role/ecsTaskExecutionRole",
"containerDefinitions": [
{
"dnsSearchDomains": null,
"logConfiguration": null,
"entryPoint": null,
"portMappings": [
{
"hostPort": 3000,
"protocol": "tcp",
"containerPort": 3000
}
],
"command": null,
"linuxParameters": null,
"cpu": 0,
"environment": [
{
"name": "MB_DB_FILE",
"value": "/tmp/metabase.db"
}
],
"resourceRequirements": null,
"ulimits": null,
"dnsServers": null,
"mountPoints": [
{
"readOnly": null,
"containerPath": "/tmp",
"sourceVolume": "metabase_db"
}
],
"workingDirectory": null,
"secrets": null,
"dockerSecurityOptions": null,
"memory": null,
"memoryReservation": 512,
"volumesFrom": [],
"stopTimeout": null,
"image": "metabase/metabase:v0.31.2",
"startTimeout": null,
"dependsOn": null,
"disableNetworking": null,
"interactive": null,
"healthCheck": null,
"essential": true,
"links": null,
"hostname": null,
"extraHosts": null,
"pseudoTerminal": null,
"user": null,
"readonlyRootFilesystem": null,
"dockerLabels": null,
"systemControls": null,
"privileged": null,
"name": "metabase"
}
],
"placementConstraints": [],
"memory": null,
"taskRoleArn": null,
"compatibilities": [
"EC2"
],
"taskDefinitionArn": "arn:aws:ecs:us-west-2:212785478310:task-definition/botland-metabase:7",
"family": "botland-metabase",
"requiresAttributes": [
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
}
],
"pidMode": null,
"requiresCompatibilities": [
"EC2"
],
"networkMode": null,
"cpu": null,
"revision": 7,
"status": "ACTIVE",
"proxyConfiguration": null,
"volumes": [
{
"name": "metabase_db",
"host": {
"sourcePath": "/metabase_db/"
},
"dockerVolumeConfiguration": null
}
]
}
  • Task execution role: ecsTaskExecutionRole
  • Task size: leave blank
  • Scroll down to volumes
  • Container definition
    • Name: metabase
    • Image: metabase/metabase:latest
      • Do not try putting “dockerhub” or something into the image
      • If you want a specific version, replace “latest” with something like “v0.31.2”.
    • Soft limit: 512
      • This means that ECS will reserve at least that many MB for the container
    • Port mappings: 3000 → 3000
    • Environment
      • Key: MB_DB_FILE
      • Value: /tmp/metabase.db
    • Mount points
      • Volume: metabase_db (this only shows if you properly configured it on the last screen as mentioned above)
      • Path: /tmp
    • Leave everything else blank
  • At this point, I had to finish configuring the task definition with JSON in order to set up a bind mount (so that Metabase could persist its data to EBS) (AWS reference, Metabase reference):
    • Add this inside the container definition:
"mountPoints": [
{
"sourceVolume": "/metabase_db",
"containerPath": "/tmp"
}
],
"environment": [
{
"name": "MB_DB_FILE",
"value": "/tmp/metabase.db"
}
],
Create a service in your cluster
  • Launch type: EC2

  • Task definition: the TD you made earlier

  • Number of tasks: 1

  • Minimum percent: 0

  • Maximum percent: 100

  • Nothing else really matters enough to customize here

  • No load balancer needed, no auto scaling needed

  • EC2

    • [Note: it may be wise to make a launch template for this]
    • Make a new instance
    • Choose an ECS-optimized community AMI using this list. They’re free, so don’t just search for “ECS” or you may end up paying for the AMI.
    • Instance type: t2.micro
    • Subnet: pick a public subnet so that you can access the instance. Ensure that “Auto-assign Public IP” is set to “Enable” after that, otherwise you picked a bad subnet.
    • IAM role: choose the IAM role you created earlier
    • Enable termination protection: enable (so that you don’t destroy the EBS volume and lose all data)
    • Way at the bottom of Step 3: Configure Instance Details, you’ll see Advanced Details. Include this text:

#!/bin/bash

echo ECS_CLUSTER=your_cluster_name >> /etc/ecs/ecs.config

(make certain that you replace “your_cluster_name” above.

  • Storage: add EBS for 30 GB (that’s the minimum; 8 for the root, 22 for data). If it doesn’t already say “30 GB”, then you probably didn’t pick the right AMI.

  • Tag the instance with “Purpose: Metabase for ECS”.

  • Configure security group to be something that grants access to the IPs that you want to access this from. You’ll probably want SSH access and TCP on port 3000 (Metabase’s default port).

    • After the security group is created, modify your database’s security group to allow access from Metabase.
  • ECS

    • Choose your cluster
      • Note: if you see a cluster named “default” that wasn’t there before, then you picked the right AMI, but you didn’t specify the ECS_CLUSTER variable correctly.
    • Make sure you have a container instance now that the EC2 instance is up and running.
    • I force-started the task by updating the service to “force new deployment” with 1 task.
  • Browser

    • Load https://EC2_IP_HERE:3000
    • If that doesn’t work, SSH into the instance and do “docker ps -a”, then “docker logs <container ID>“.
    • If you see “Metabase Initialization COMPLETE”, then try doing “curl http://localhost:3000” from the host.
      • If that doesn’t work, try it from inside of the container:
      • If it works inside the container, then the port just isn’t exposed to the world (so you probably didn’t configure the task definition correctly).

At this point, you can stop/start your EC2 instance whenever you want, and ECS will detect that it’s online and start the Metabase Docker task in it (although this could take a little bit of time, in which case you can always force a new deployment through ECS). Thanks to setting up a bind mount, any dashboards/questions/settings will be saved to the EBS volume, which is unaffected by stopping the corresponding EC2 instance.

Remember that restarting an EC2 instance gives it a new IP address, so if that bugs you, use Route53 to set up a CNAME or something.

Note: I tried terminating an EC2 instance once to see if I could set up the database on a new machine, but I messed up the instance details (the #!/bin/bash script), so an instance just never left the draining state and a task never seemed to be completely stopped. I ended up deleting and recreating the service before realizing that I messed up the instance details, so I don’t know if there are any special instructions around termination that I didn’t mark down.

I actually ended up deleting the entire cluster (and then of course completely deleting the EC2 instance to give it different instance details since I don’t think simply stopping/restarting is enough; the text file never got updated and I couldn’t even modify it with “sudo” (although I suck at Linux)).

Also, the volume you set up produces /metabase_db/metabase.db/metabase.db.mv.db on the host machine (which is what EBS is connected to), so you can always save that database and you should be good to delete EBS completely (and save $3/month). SCPing it to your local computer is probably the easiest way to go:

Saving the file:

scp -i path/to/key/file ec2-user@EC2_IP_ADDR:/metabase_db/metabase.db/metabase.db.mv.db ./

Uploading the file again:

ssh -i path/to/key/file ec2-user@EC2_IP_ADDR “sudo mkdir -p /metabase_db/metabase.db/”

scp -i path/to/key/file ./metabase.db.mv.db ec2-user@EC2_IP_ADDR:/metabase_db/metabase.db/

↑ This may fail with permission denied, in which case just copy it to /home/ec2-user, then do a “sudo mv /home/ec2-user /metabase_db/metabase.db/”

Using the file you get from AWS locally should be impossible because your database shouldn’t be accessible to the outside world (and it’s 100% impossible if your database is hosted on Aurora, because Aurora enforces that your database is only internally accessible). In that case, Metabase will just fail on startup because it can’t contact the database. Regardless, here’s how I even got to that point of failure on Windows:

  • Enable shared drives in Docker settings on Windows. I did this for my “E:\ drive.
  • Create E:testeronimetabase.db\
  • Copy metabase.db.mv.db there
  • docker run -d -p 7678:3000 -v e:/testeroni:/tmp -e “MB_DB_FILE=/tmp/metabase.db” —name metabase metabase/metabase

Just copy “metabase.db.mv.db” from wherever you ran metabase.jar and you’re done.

Pointing at a different database with the same queries

Section titled Pointing at a different database with the same queries

I don’t know if there’s an official way to do this, but I did this (update: I think these steps only work if you can reach both the original database and the one you’re going to change it to):

  • Click your user icon at the upper right
  • Go to the Admin Panel
  • Click “Databases” at the top center
  • Click your database
  • Change the database name or details

No results returned for “previous X days” or “next X days”

Section titled No results returned for “previous X days” or “next X days”

“Previous” and “next” do not include the current day. I couldn’t figure out how to get a question that returns today AND the previous X days. Instead, I had to go into raw SQL and do something like this:

SELECT *

FROM users

WHERE creation_date > NOW() - INTERVAL 7 DAY;

Relevant open issues as of 7/7/2017:

https://github.com/metabase/metabase/issues/2496

https://github.com/metabase/metabase/issues/4083

Almost all of the date-related questions that you can make by default will have “NOW()” in them, which takes your timezone into account. If you use UTC datetimes in your database like I do, then you can refer to this to change your timezone globally (SET GLOBAL time_zone = ‘+00:00’;), then restart Metabase altogether.

I found out that you need to select DATETIMEs with the “date()” function, e.g.

CORRECT: SELECT count(*) AS count, date(users.creation_date) AS creation_date

INCORRECT: SELECT count(*) as count, creation_date

(note: backticks shouldn’t really matter)

For this, press F12 in Chrome, go to the Network tab, and look at the POST response to see what the error is exactly.

A new question caused a dashboard to produce a JavaScript error

Section titled A new question caused a dashboard to produce a JavaScript error

E.g. “Failed to execute ‘removeChild’ on ‘Node’: parameter 1 is not of type ‘Node’.”

First, the simple solution: just refresh the page (i.e. don’t go back and then forward; actually press F5 or ctrl+R).

Second, the hard solution: Home → Browse all items → Find the troublesome question and edit it through this interface

I assume this is a temporary bug in version 0.31.2, but if you zoom in on Chrome to 250%, you can click the gear/X icons all of a sudden.

Or maybe the bug is that if you move even a single pixel while clicking, it will invoke a drag action rather than a click.

Every once in a while, I’ll have cards like these:

It didn’t make sense why I’d get a discrepancy between what looked like two identical queries. It’s because the relative query on the left represents calendar days, not 7 days’ worth of hours. You can see this if you use the button at the upper right that lets you see the underlying SQL query (just showing a snippet here):

[…] date(`users`.`creation_date`) BETWEEN date(date_add(now(), INTERVAL -7 day))
AND date(date_add(now(), INTERVAL -1 day)) […]

Metabase syncs occasionally, so it’s possible that you’re only running a container for such a brief time that it’s never syncing. To manually sync, click on the right side (note: I already did click it, so the button turned green):