Install Atlas

Install Apache Atlas | Complete tutorial with examples in 2023

As data volumes grow, managing compliance and governance becomes correspondingly harder. This post explains how to install Apache Atlas on your system.

What is Apache Atlas?

Before installing Apache Atlas, let’s first understand what it is and why every organization needs a data governance and compliance tool.

Apache Atlas is an open-source tool for data governance and metadata management. It helps companies meet their compliance requirements effectively and efficiently.

Apache Atlas’s popularity is growing because it integrates easily with well-known big data tools such as Hadoop, Kafka, Spark, Hive, and Impala. It also provides a REST API that we can use to create and update data lineage.
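
As a rough illustration of the REST API, the Python sketch below builds an authenticated request for the Atlas version endpoint. The base URL http://localhost:21000 and the admin/admin credentials are assumptions that match the local Docker setup described later in this post, and build_request and get_json are hypothetical helper names, not part of Atlas.

```python
import base64
import json
import urllib.request

# Assumption: Atlas is reachable locally on its default port.
ATLAS_URL = "http://localhost:21000"


def build_request(path, user="admin", password="admin"):
    """Return a urllib Request for an Atlas REST path using HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        ATLAS_URL + path,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
        },
    )


def get_json(path):
    """Send the request and decode the JSON body (needs a running Atlas server)."""
    with urllib.request.urlopen(build_request(path), timeout=10) as resp:
        return json.load(resp)
```

Once the server is running, calling get_json("/api/atlas/admin/version") should return the server's version details as a dictionary.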

It has many predefined types, and users can add new types based on their requirements. It also supports a SQL-like query engine to search entities. You can check this link to understand more about the Atlas features.
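
To give a feel for what adding a new type involves, here is a minimal sketch that builds a type definition payload in Python. The type name report_dashboard and its attributes are made up for illustration, and the payload shape reflects my understanding of what the v2 typedefs endpoint (POST /api/atlas/v2/types/typedefs) accepts, so check it against the REST documentation for your Atlas version before relying on it.

```python
import json


def make_entity_def(name, super_type, attributes):
    """Build a typedef payload declaring one entity type with optional string attributes."""
    attribute_defs = [
        {
            "name": attr,
            "typeName": "string",
            "isOptional": True,
            "cardinality": "SINGLE",
            "isUnique": False,
            "isIndexable": False,
        }
        for attr in attributes
    ]
    return {
        "entityDefs": [
            {
                "name": name,
                "superTypes": [super_type],
                "attributeDefs": attribute_defs,
            }
        ]
    }


# Hypothetical custom type that extends the built-in DataSet type.
payload = make_entity_def("report_dashboard", "DataSet", ["tool", "team"])
print(json.dumps(payload, indent=2))
```

The printed JSON is what would be sent as the request body, together with the admin credentials.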

How to install Atlas

In this section, we will learn how to install Apache Atlas using Docker. You can follow this tutorial to install Docker on your system.

Run the command below to pull the Atlas Docker image:

docker pull sburn/apache-atlas
Using default tag: latest
latest: Pulling from sburn/apache-atlas
d519e2592276: Pull complete 
d22d2dfcfa9c: Pull complete 
b3afe92c540b: Pull complete 
9070b09379d6: Pull complete 
968e3feb8e26: Pull complete 
4568df43ab62: Pull complete 
6cd5206cb36f: Pull complete 
7e90f6010249: Pull complete 
9646c7ee49f9: Pull complete 
57a26972c6b6: Pull complete 
4ddabc3ff1ef: Pull complete 
Digest: sha256:1eca23ef34204ee9a15ec809b695fb0a1a2a12cf68db18642c9e90875675a5c6
Status: Downloaded newer image for sburn/apache-atlas:latest
docker.io/sburn/apache-atlas:latest

Run the command below to verify that the image was pulled successfully:

docker images

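Start a container from the image with the command below. The start script path assumes the image ships Atlas 2.1.0; if your image contains a different version, adjust the path accordingly: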

docker run -d \
-p 21000:21000 \
--name atlas \
sburn/apache-atlas \
/opt/apache-atlas-2.1.0/bin/atlas_start.py

Verify that the Docker container is running:

docker ps
CONTAINER ID   IMAGE                COMMAND                  CREATED         STATUS         PORTS                                           NAMES
7092a4f8e34b   sburn/apache-atlas   "/opt/apache-atlas-2…"   3 seconds ago   Up 2 seconds   0.0.0.0:21000->21000/tcp, :::21000->21000/tcp   atlas

Go to http://localhost:21000 to access the Atlas UI.
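
The UI may not answer right away, since Atlas can take a few minutes to finish starting inside the container. The sketch below polls the URL until the server responds. The retry count, the delay, and the function names probe and wait_for_atlas are illustrative choices, and treating any non-5xx response as "up" is a simplifying assumption.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5):
    """Return the HTTP status code, or None if the server is not reachable yet."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, e.g. 401 without credentials
    except (urllib.error.URLError, OSError):
        return None  # connection refused, reset, or timed out


def wait_for_atlas(url="http://localhost:21000/", attempts=30, delay=10,
                   probe_fn=probe, sleep_fn=time.sleep):
    """Poll url until it answers with a non-5xx status; return True on success."""
    for _ in range(attempts):
        status = probe_fn(url)
        if status is not None and status < 500:
            return True
        sleep_fn(delay)
    return False
```

Calling wait_for_atlas() with the defaults waits up to about five minutes and returns False if the server never responds in that time.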

[Screenshot: Atlas login page]

Use the admin/admin credentials to log in to the Atlas UI.

[Screenshot: Atlas home page]

On successful login, you will see the Atlas home page. You can use the different search options to filter the data you need.

Add sample data to Apache Atlas

When you first log in to the Atlas UI, it contains no preloaded data. However, Atlas ships with a script that loads sample data.

To load the sample data, first open a shell inside the container. The container ID differs on every machine, so the command below uses the name atlas that was assigned with --name:

docker exec -it atlas /bin/bash

Finally, run the quick start script to load the sample data into Apache Atlas:

python /opt/apache-atlas-2.1.0/bin/quick_start.py

Use admin/admin as the username and password when prompted.

The script creates sample types and entities, runs a set of sample queries, and prints output like the following:

python /opt/apache-atlas-2.1.0/bin/quick_start.py
Enter username for atlas :- admin
Enter password for atlas :- 
Creating sample types: 
Created type [DB]
Created type [Table]
Created type [StorageDesc]
Created type [Column]
Created type [LoadProcess]
Created type [LoadProcessExecution]
Created type [View]
Created type [JdbcAccess]
Created type [ETL]
Created type [Metric]
Created type [PII]
Created type [Fact]
Created type [Dimension]
Created type [Log Data]
Created type [Table_DB]
Created type [View_DB]
Created type [View_Tables]
Created type [Table_Columns]
Created type [Table_StorageDesc]
Creating sample entities: 
Created entity of type [DB], guid: c20def06-fb2a-452c-8ef9-bbd57708dede
Created entity of type [DB], guid: e3be9ce6-5eac-4e0b-9f25-01294fd72f34
Created entity of type [DB], guid: ae2d7638-9319-4ea6-b46c-e29ccf2a2079
Created entity of type [Table], guid: 4d8f5fad-0e91-4b7b-876d-7373ce8286c9
Created entity of type [Table], guid: fd4648ce-7e06-4ce2-b12b-24d1c0f38f41
Created entity of type [Table], guid: 85e75fdf-9ef7-4a88-bd97-6c9b1183de31
Created entity of type [Table], guid: ff1a2797-af24-4712-98bc-a7524a67d652
Created entity of type [Table], guid: f4de886d-00be-4664-aa3c-fa6e1f0cc44f
Created entity of type [Table], guid: fd5ec4df-a5d8-401b-91a8-1cc1435de000
Created entity of type [Table], guid: 2956330a-03c0-483e-accb-423edc39d4b7
Created entity of type [Table], guid: cfce1b26-73c5-4e16-819d-cc4d5cebb317
Created entity of type [View], guid: 7220d223-17ac-4042-9dea-9d087314c976
Created entity of type [View], guid: 274afdbe-3a40-4e25-b53c-7f8bc0efa9fa
Created entity of type [LoadProcess], guid: 9f0135c1-8c1e-4f29-a95a-50e539066d8d
Created entity of type [LoadProcessExecution], guid: 821a79af-9942-46ac-94fb-a5c4814f1cde
Created entity of type [LoadProcessExecution], guid: cfd36c58-e1e5-48df-839f-65a50a25d15a
Created entity of type [LoadProcess], guid: d68ed431-47c6-4163-8a47-6e220cee7aaa
Created entity of type [LoadProcessExecution], guid: c632fc7d-c24d-466e-997e-2be170bfd533
Created entity of type [LoadProcessExecution], guid: 9d852052-5314-4b3e-81a7-084c24949e2b
Created entity of type [LoadProcess], guid: 55d3b414-67b2-4384-bc5d-11ac1e2c3b49
Created entity of type [LoadProcessExecution], guid: b1793392-5292-4596-9fac-aeb44fe871eb
Created entity of type [LoadProcessExecution], guid: fafbe5d8-b179-425a-b85a-c6a37c0dd7ed
Sample DSL Queries: 
query [from DB] returned [3] rows.
query [DB] returned [3] rows.
query [DB where name=%22Reporting%22] returned [1] rows.
query [DB where name=%22encode_db_name%22] returned [ 0 ] rows.
query [Table where name=%2522sales_fact%2522] returned [1] rows.
query [DB where name="Reporting"] returned [1] rows.
query [DB where DB.name="Reporting"] returned [1] rows.
query [DB name = "Reporting"] returned [1] rows.
query [DB DB.name = "Reporting"] returned [1] rows.
query [DB where name="Reporting" select name, owner] returned [1] rows.
query [DB where DB.name="Reporting" select name, owner] returned [1] rows.
query [DB has name] returned [3] rows.
query [DB where DB has name] returned [3] rows.
query [DB is JdbcAccess] returned [ 0 ] rows.
query [from Table] returned [8] rows.
query [Table] returned [8] rows.
query [Table is Dimension] returned [5] rows.
query [Column where Column isa PII] returned [3] rows.
query [View is Dimension] returned [2] rows.
query [Column select Column.name] returned [10] rows.
query [Column select name] returned [9] rows.
query [Column where Column.name="customer_id"] returned [1] rows.
query [from Table select Table.name] returned [8] rows.
query [DB where (name = "Reporting")] returned [1] rows.
query [DB where DB is JdbcAccess] returned [ 0 ] rows.
query [DB where DB has name] returned [3] rows.
query [DB as db1 Table where (db1.name = "Reporting")] returned [ 0 ] rows.
query [Dimension] returned [9] rows.
query [JdbcAccess] returned [2] rows.
query [ETL] returned [10] rows.
query [Metric] returned [4] rows.
query [PII] returned [3] rows.
query [`Log Data`] returned [4] rows.
query [Table where name="sales_fact", columns] returned [4] rows.
query [Table where name="sales_fact", columns as column select column.name, column.dataType, column.comment] returned [4] rows.
query [from DataSet] returned [10] rows.
query [from Process] returned [3] rows.
Sample Lineage Info: 
sales_fact_daily_mv(Table) -> loadSalesMonthly(LoadProcess)
time_dim(Table) -> loadSalesDaily(LoadProcess)
loadSalesDaily(LoadProcess) -> sales_fact_daily_mv(Table)
loadSalesMonthly(LoadProcess) -> sales_fact_monthly_mv(Table)
sales_fact(Table) -> loadSalesDaily(LoadProcess)
Sample data added to Apache Atlas Server.
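
The "Sample Lineage Info" lines at the end of this output describe a small graph of tables and load processes. As a quick way to read it, the sketch below parses those five lines and lists everything downstream of sales_fact. The helper names parse_edges and downstream are hypothetical; this only interprets the printed text and does not call Atlas.

```python
from collections import defaultdict, deque

# Lineage lines copied from the quick_start.py output above.
LINEAGE_LINES = """\
sales_fact_daily_mv(Table) -> loadSalesMonthly(LoadProcess)
time_dim(Table) -> loadSalesDaily(LoadProcess)
loadSalesDaily(LoadProcess) -> sales_fact_daily_mv(Table)
loadSalesMonthly(LoadProcess) -> sales_fact_monthly_mv(Table)
sales_fact(Table) -> loadSalesDaily(LoadProcess)
"""


def parse_edges(text):
    """Turn 'a(Type) -> b(Type)' lines into a {source: [targets]} mapping."""
    graph = defaultdict(list)
    for line in text.splitlines():
        if "->" not in line:
            continue
        # Drop the "(Type)" suffix and keep only the entity name.
        source, target = (part.strip().split("(")[0] for part in line.split("->"))
        graph[source].append(target)
    return graph


def downstream(graph, start):
    """Return every node reachable from start, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


graph = parse_edges(LINEAGE_LINES)
print(downstream(graph, "sales_fact"))
# ['loadSalesDaily', 'sales_fact_daily_mv', 'loadSalesMonthly', 'sales_fact_monthly_mv']
```

In other words, a change to sales_fact flows through the daily load into sales_fact_daily_mv, and from there through the monthly load into sales_fact_monthly_mv.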

Now you can use the Atlas UI to explore and analyze the loaded sample data.
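
The same DSL queries shown in the script output can also be sent to the REST API. The sketch below only builds the URL for the v2 DSL search endpoint. The base URL assumes the local setup from this post, dsl_search_url is a hypothetical helper, and the limit value is an arbitrary choice.

```python
from urllib.parse import quote, urlencode

# Assumption: Atlas is reachable locally on its default port.
ATLAS_URL = "http://localhost:21000"


def dsl_search_url(query, limit=10):
    """Return the v2 DSL search URL for the given query string."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    params = urlencode({"query": query, "limit": limit}, quote_via=quote)
    return f"{ATLAS_URL}/api/atlas/v2/search/dsl?{params}"


print(dsl_search_url('DB where name="Reporting"'))
# http://localhost:21000/api/atlas/v2/search/dsl?query=DB%20where%20name%3D%22Reporting%22&limit=10
```

Requesting this URL requires authentication (admin/admin in this setup), and against the sample data the query should match the single Reporting database reported in the script output.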

Conclusion

I hope you liked this tutorial on installing Apache Atlas using a Docker container. Feel free to ask your questions in the comments section below.
