Archiving Logs

This guide covers how to use the archiving feature located under the Settings pane of the LogDNA Web App.

Overview

Archiving is an automatic function that exports your logs from LogDNA to an external source. Archived logs are in JSON format and preserve metadata associated with each line. Once archiving is configured for your account, your logs will be exported daily in a compressed format (.json.gz). The first time you configure archiving, your archived logs will typically appear within 24-48 hours.

Types of Data Archiving

LogDNA currently supports daily archiving for customers, but will be moving to hourly archiving in 2020.

Daily Archiving

Turning on archiving for your account will automatically enable the daily archive method. This method creates 1 Gzip JSON file per day. For days that have no logs, no files will be created. The daily archiving method works well for smaller accounts. These files will be stored under /<accountID>.<YYYY>-<MM>-<DD>.<clusterId>.json.gz for most providers and YYYY/MM/<accountID>.<YYYY>-<MM>-<DD>.<clusterId>.json.gz for the S3 provider.
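
If you need to locate a specific day's archive programmatically, the object key can be derived from the naming scheme above. A minimal Python sketch using hypothetical account and cluster IDs:

    account_id = "abc123"          # hypothetical account ID
    cluster_id = "ld42"            # hypothetical cluster ID
    year, month, day = "2019", "11", "05"

    # Layout used by most providers
    default_key = f"{account_id}.{year}-{month}-{day}.{cluster_id}.json.gz"
    # Layout used by the S3 provider
    s3_key = f"{year}/{month}/{account_id}.{year}-{month}-{day}.{cluster_id}.json.gz"

    print(default_key)   # abc123.2019-11-05.ld42.json.gz
    print(s3_key)        # 2019/11/abc123.2019-11-05.ld42.json.gz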

Hourly Archiving (coming soon)

If you have a larger account with large archive files, hourly archiving may be a better option for you. Hourly archiving creates 24+ Gzip JSON files per day. If there are no logs for a given hour, no file will be uploaded for that hour. Archives are expected to appear within 24 hours but may take up to 72 hours for larger customers. The file contents stay the same as the daily archives, except that files are now stored as year=YYYY/month=MM/day=DD/<accountID>.<YYYY>-<MM>-<DD>.<HH>00.json.gz (where HH is the hour in 24-hour format) for all providers.

Note: If log lines are attributed to, or received, more than 6 hours beyond the hour bucket they belong to, additional archive files will be created in the name format <accountID>.<YYYY>-<MM>-<DD>.<HH>00.<NUMBER>.json.gz, where <NUMBER> is an incrementing number starting from 1, to prevent filename conflicts.

Hourly archiving may occasionally create duplicate log lines in storage (about 1% of the time). You can tell that a line has been duplicated if two lines share the same log line ID.
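
If duplicates matter for your downstream processing, they can be filtered out after download. A minimal Python sketch, assuming each archived line exposes its log line ID in an _id field (the field name is an assumption) and using a hypothetical file name:

    import gzip
    import json

    seen_ids = set()
    unique_lines = []

    # Hypothetical hourly archive file name
    with gzip.open("abc123.2019-11-05.0300.json.gz", "rt") as archive:
        for raw in archive:
            line = json.loads(raw)
            line_id = line.get("_id")    # assumed log line ID field
            if line_id in seen_ids:
                continue                 # skip duplicated lines
            seen_ids.add(line_id)
            unique_lines.append(line)

    print(f"{len(unique_lines)} unique log lines")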

AWS S3

To export your logs to an S3 bucket, ensure that you have an AWS account with access to S3.

Create and Configure your bucket

  1. In AWS > S3, create a new bucket
  2. Give your bucket a unique name and select a region for it to reside in
  3. Click the Next button until you can create the bucket
  4. Configure the bucket permissions by managing the Access Control List
  5. Click the Add Account button and enter [email protected] as the email
  6. Instead of an email address, you can also use this identifier: 659c621e261e7ffa5d8f925bbe9fe1698f3637878e96bc1a9e7216838799b71a
  7. Check all available boxes
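
If you prefer to grant these permissions from code rather than the console, the same ACL grants can be applied with boto3. This is a minimal sketch under the assumption that boto3 is installed and your AWS credentials are configured; note that put_bucket_acl replaces the existing ACL, so the owner grant is re-applied alongside the LogDNA grants:

    import boto3

    s3 = boto3.client("s3")
    bucket = "my-logdna-archive"   # hypothetical bucket name

    # Keep the bucket owner's access when replacing the ACL
    owner_id = s3.get_bucket_acl(Bucket=bucket)["Owner"]["ID"]
    logdna_id = "659c621e261e7ffa5d8f925bbe9fe1698f3637878e96bc1a9e7216838799b71a"

    s3.put_bucket_acl(
        Bucket=bucket,
        GrantFullControl=f'id="{owner_id}"',   # bucket owner
        GrantRead=f'id="{logdna_id}"',         # list objects
        GrantWrite=f'id="{logdna_id}"',        # write objects
        GrantReadACP=f'id="{logdna_id}"',      # read bucket permissions
        GrantWriteACP=f'id="{logdna_id}"',     # write bucket permissions
    )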

Configure LogDNA

  1. Go to the Archive pane of the LogDNA web app
  2. Under the S3 Archiving section, input the name of your newly created S3 bucket, and click Save.

Azure Blob Storage

To export your logs to Azure Blob Storage, ensure that you have an Azure account with access to storage accounts.

  1. Create a Storage Account on Microsoft Azure
  2. Once created, click your storage account and then click Access Keys under the heading Settings
  3. Create a key if you do not already have one
  4. Go to the Archive pane of the LogDNA web app
  5. Under the Azure Blob Storage archiving section, input your storage account name and key and then click Save.
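
To verify that archives are landing in your storage account, you can list blobs with the azure-storage-blob Python package. A minimal sketch with hypothetical account, key, and container names:

    from azure.storage.blob import BlobServiceClient

    account_name = "mystorageaccount"   # hypothetical storage account name
    account_key = "YOUR_ACCESS_KEY"     # key from Access Keys under Settings

    service = BlobServiceClient(
        account_url=f"https://{account_name}.blob.core.windows.net",
        credential=account_key,
    )

    # Hypothetical container name; use whichever container holds your archives
    container = service.get_container_client("logdna")
    for blob in container.list_blobs():
        print(blob.name, blob.size)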

Google Cloud Storage

To export your logs to Google Cloud Storage, ensure that you have a Google Cloud Platform account and project with access to storage.

  1. Ensure that Google Cloud Storage JSON API is enabled.
  2. Create a new bucket (or use an existing one) in Google Cloud Storage.
  3. Update the permissions of the bucket and add a new member [email protected] with the role of Storage Admin.
  4. Go to the Archive pane of the LogDNA web app.
  5. Under the Google Cloud Storage Archiving section, input your ProjectId and Bucket and then click Save.
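
The permission grant in step 3 can also be made programmatically with the google-cloud-storage Python package. A minimal sketch, assuming the package is installed and your credentials are configured, with hypothetical project and bucket names:

    from google.cloud import storage

    client = storage.Client(project="my-project")   # hypothetical project ID
    bucket = client.bucket("my-logdna-archive")     # hypothetical bucket name

    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(
        {
            "role": "roles/storage.admin",
            # The LogDNA member email from step 3 above
            "members": {"user:[email protected]"},
        }
    )
    bucket.set_iam_policy(policy)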

OpenStack Swift

To export your logs to OpenStack Swift, ensure that you have an OpenStack account with access to Swift.

  1. Set up Swift by following these instructions.
  2. Go to the Archive pane of the LogDNA web app.
  3. Under the OpenStack Swift Archiving section, input your Username, Password, Auth URL, and Tenant Name and then click Save.
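
To confirm that archives arrive in Swift, you can list your container with the python-swiftclient package. A minimal sketch using the same credentials as above and a hypothetical container name:

    import swiftclient

    conn = swiftclient.Connection(
        authurl="https://auth.example.com/v2.0",   # your Auth URL
        user="my-username",
        key="my-password",
        tenant_name="my-tenant",
        auth_version="2",
    )

    # Hypothetical container name; use whichever container holds your archives
    headers, objects = conn.get_container("logdna")
    for obj in objects:
        print(obj["name"], obj["bytes"])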

Digital Ocean Spaces

To export your logs to Digital Ocean Spaces, ensure that you have a Digital Ocean account with access to storage.

  1. Create a new space (or use an existing one) in Digital Ocean Spaces.
  2. Create a new spaces access key in Digital Ocean Applications & API. Make sure to save the access key and secret key.
  3. Go to the Archive pane of the LogDNA web app.
  4. Under the Digital Ocean Spaces Archiving section, input your Bucket, Region, AccessKey, and SecretKey. Note that your region can be found in your Spaces URL; e.g., https://my-logdna-bucket.nyc3.digitaloceanspaces.com has the region nyc3.
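
Because Spaces exposes an S3-compatible API, you can verify your configuration by listing the bucket with boto3 pointed at your region's endpoint. A minimal sketch with hypothetical values:

    import boto3

    client = boto3.client(
        "s3",
        region_name="nyc3",                                  # region from your Spaces URL
        endpoint_url="https://nyc3.digitaloceanspaces.com",  # regional endpoint
        aws_access_key_id="SPACES_ACCESS_KEY",
        aws_secret_access_key="SPACES_SECRET_KEY",
    )

    response = client.list_objects_v2(Bucket="my-logdna-bucket")   # hypothetical space name
    for obj in response.get("Contents", []):
        print(obj["Key"], obj["Size"])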

IBM Cloud Object Storage Archiving

To export your logs to IBM Cloud Object Storage Archiving, ensure that you have an IBM Cloud account with access to storage.

  1. Create a new object storage service (or use an existing one) in IBM Cloud Object Storage.
  2. Create a new bucket (or use an existing one) in your service for LogDNA dump files.
  3. Go to the Archive pane of the LogDNA web app.
  4. Under the IBM Cloud Object Storage Archiving section, input your Bucket, Public Endpoint, API Key, and Resource Instance ID and then click Save.
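
To verify the bucket from code, the ibm-cos-sdk Python package (which mirrors boto3) accepts the same API Key and Resource Instance ID. A minimal sketch with hypothetical values; the endpoint should match your bucket's public endpoint:

    import ibm_boto3
    from ibm_botocore.client import Config

    cos = ibm_boto3.client(
        "s3",
        ibm_api_key_id="YOUR_API_KEY",
        ibm_service_instance_id="YOUR_RESOURCE_INSTANCE_ID",
        config=Config(signature_version="oauth"),
        endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",
    )

    response = cos.list_objects_v2(Bucket="my-logdna-archive")   # hypothetical bucket name
    for obj in response.get("Contents", []):
        print(obj["Key"])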

Security

By default, LogDNA encrypts your archived data in transit and requests server-side encryption where possible, including setting the x-amz-server-side-encryption header when uploading logs to S3.

Reading archived logs

Log files are stored in a gzip-compressed JSON Lines format. While we do not currently support re-ingesting historical data, there are a number of tools we can recommend for parsing your archived logs.
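
Because each archive is simply gzip-compressed JSON Lines, it can also be read directly with the Python standard library. A minimal sketch using a hypothetical file name; the field names printed here are assumptions and may vary with your log sources:

    import gzip
    import json

    # Hypothetical daily archive file name
    with gzip.open("abc123.2019-11-05.ld42.json.gz", "rt") as archive:
        for raw in archive:
            line = json.loads(raw)
            # Field names depend on your log sources and metadata
            print(line.get("_ts"), line.get("_line"))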

Amazon Athena

Amazon Athena is a serverless interactive query service that can analyze large datasets residing in S3 buckets. You can use Amazon Athena to define a schema and query results using SQL. More information about Amazon Athena is available here.

Google BigQuery

Google BigQuery is a serverless enterprise data warehouse that can analyze large datasets. One of our customers, Life.Church, has generously shared a command line utility, DNAQuery, that loads LogDNA archived data into Google BigQuery. More information about Google BigQuery is available here.

IBM SQL Query

IBM SQL Query is a serverless data processing and analytics service for large volumes of data stored on IBM Cloud Object Storage. You can use it to query and transform data as-is on object storage using SQL. You can optionally also define a table definition and query against that instead. See this blog article for details on using IBM SQL Query with LogDNA data in IBM Cloud.

jq

jq is a handy command-line tool for parsing JSON data. Once your archive has been uncompressed, you can use jq to parse your archived log files. More information about jq is available here.
