Quickstart

This tutorial will introduce you to about 90% of the concepts you'll need on a daily basis.

You will learn
  • How to run a query
  • How to define a data package with a live connection to a data source
  • How to publish your package for others in their preferred runtime language
  • How to install with a package manager & import into a code base
  • How to write and run advanced analytics queries and single row lookups
  • How to update your data package

Install the dpm Client

If you'd prefer, you can find all of our binaries for download on our GitHub page.

Download and install dpm

brew tap patch-tech/tap
brew install patch-tech/tap/dpm

Log into dpm using the CLI

dpm login # run in a new terminal window, then follow the prompts to log in with GitHub

Run your first query

It's time to query data! Follow along in your preferred runtime language.

  1. Create a new project

    mkdir demo-project
    cd demo-project
    npm init -y
    npm install ts-node
  2. Run the following to build the demo package. The build output is written to a ./dist folder in the current working directory.

    dpm build-package -p "Snowflake Demo Package (fast)@0.1.0" nodejs # you can use the -o flag to specify a different output directory
  3. Install the demo package. If you used the -o flag in step 2, be sure to use that pathname here.

    npm install ./dist/nodejs/snowflake-demo-package-fast-0.1.0-0.2.0.tgz
  4. Create a file to run the query. You're welcome to use your favorite editor, but here's a quick way to do it from the command line:

    cat > first_query.ts # at the prompt, paste the snippet from the next step, then press Ctrl-D to save
  5. Paste the following code into your terminal, then press Ctrl-D to save.

    import { FactsAppEngagement as FactsAppEngagementSnow } from 'snowflake-demo-package-fast';

    // Get avg time in app and user counts
    // broken down by app and day of week
    async function main() {
      const { appTitle, foregroundduration, panelistid, starttimestamp } = FactsAppEngagementSnow.fields;

      const query = FactsAppEngagementSnow.select(
        appTitle.as("App_Name"),
        foregroundduration.avg().as("Avg_Time_in_App"),
        panelistid.countDistinct().as("User_Count"),
        starttimestamp.day.as("Day_of_week")
      );

      await query.compile().then((data) => console.log("Compiled query: ", data));
      await query.execute().then((data) => console.log(data));
    }

    main().catch(console.error);
  6. Run the code!

    npx ts-node first_query.ts
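
To make the shape of the query above more concrete, here is a toy, self-contained sketch of how a fluent field API of this kind can compile to SQL. Every class, table name, and the generated SQL text below is hypothetical and written only for illustration; it is not the dpm client's implementation or its actual compiled output.

```typescript
// Toy illustration only: NOT the dpm client. All names here are hypothetical.
class Field {
  expr: string;
  aggregate: boolean;
  alias: string | undefined;

  constructor(expr: string, aggregate: boolean = false, alias?: string) {
    this.expr = expr;
    this.aggregate = aggregate;
    this.alias = alias;
  }

  // Wrap the expression in an aggregate function.
  avg(): Field {
    return new Field(`AVG(${this.expr})`, true);
  }

  countDistinct(): Field {
    return new Field(`COUNT(DISTINCT ${this.expr})`, true);
  }

  // Attach an output column name, keeping the aggregate flag.
  as(alias: string): Field {
    return new Field(this.expr, this.aggregate, alias);
  }

  toSql(): string {
    return this.alias ? `${this.expr} AS "${this.alias}"` : this.expr;
  }
}

class Query {
  table: string;
  fields: Field[];

  constructor(table: string, fields: Field[]) {
    this.table = table;
    this.fields = fields;
  }

  // Non-aggregated fields become GROUP BY keys when any aggregate is selected.
  compile(): string {
    const columns = this.fields.map((f) => f.toSql()).join(", ");
    const keys = this.fields.filter((f) => !f.aggregate).map((f) => f.expr);
    const hasAggregate = this.fields.some((f) => f.aggregate);
    const groupBy = hasAggregate && keys.length > 0 ? ` GROUP BY ${keys.join(", ")}` : "";
    return `SELECT ${columns} FROM ${this.table}${groupBy}`;
  }
}

class Table {
  name: string;

  constructor(name: string) {
    this.name = name;
  }

  select(...fields: Field[]): Query {
    return new Query(this.name, fields);
  }
}

// Hypothetical table and column names, loosely mirroring the query above.
const toyQuery = new Table("FACTS_APP_ENGAGEMENT").select(
  new Field("APP_TITLE").as("App_Name"),
  new Field("FOREGROUNDDURATION").avg().as("Avg_Time_in_App"),
  new Field("PANELISTID").countDistinct().as("User_Count")
);
console.log(toyQuery.compile());
```

The point of the sketch is the pattern: each chained call returns a new field description, and nothing touches the database until the query is compiled or executed.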

Create and query your own package

To get started, you will

  1. Connect to a data source
  2. Define a data package
  3. Build the package
  4. Publish the package

Create a source

dpm currently supports one source: Snowflake. Others will be supported soon.

  1. Navigate to a directory where you'd like to consume your Snowflake data. If you'd like to reuse the demo project, navigate to that directory instead.
    cd ~/dpm-demo # or the path to your demo-project directory
  2. Run the following to create a Snowflake source. You will need to replace the <> with your own values. Try dpm source create snowflake --help for more information.
    dpm source create snowflake --name <> --organization <> --account <> --database <> --user <> --password <>

Check out the docs on creating sources for more details.
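
If you script source creation rather than typing the command by hand, a small helper can catch placeholders that were never filled in. The following is a minimal sketch: the flag names are copied from the command above, while the `sourceCreateArgs` helper and the example values are hypothetical.

```typescript
// Flag names are taken from the command above; the helper itself is hypothetical.
const FLAGS = ["name", "organization", "account", "database", "user", "password"] as const;
type SnowflakeSourceOptions = Record<(typeof FLAGS)[number], string>;

// Build the argument list for `dpm source create snowflake`, refusing to
// proceed while any value is blank or still the literal `<>` placeholder.
function sourceCreateArgs(options: SnowflakeSourceOptions): string[] {
  const args: string[] = ["source", "create", "snowflake"];
  for (const flag of FLAGS) {
    const value = options[flag].trim();
    if (value === "" || value === "<>") {
      throw new Error(`Missing value for --${flag}`);
    }
    args.push(`--${flag}`, value);
  }
  return args;
}

// Example with made-up values; substitute your own.
const exampleArgs = sourceCreateArgs({
  name: "my-source",
  organization: "my-org",
  account: "my-account",
  database: "MY_DB",
  user: "my-user",
  password: "example-password",
});
console.log(`built ${exampleArgs.length} arguments`);
```

Returning an argument array rather than one command string means the values can be handed to something like Node's `execFileSync("dpm", exampleArgs)` without shell quoting concerns.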

Create your first package

  1. Run the following to generate a descriptor file, called datapackage.json. Provide the source name from the previous step and give your package a name as well.
    # replace PACKAGE_NAME with a display name for your package,
    # SOURCE_NAME with the source name from the previous step, and
    # TABLE_NAME with the name of a table in your source (you can pass the option multiple times)
    dpm init --package-name PACKAGE_NAME SOURCE_NAME snowflake --table TABLE_NAME
  2. Publish the data package to dpm. This will make it reviewable on the Packages screen.
    dpm publish # in the directory with the `datapackage.json` file
  3. Then build the client library from the datapackage.json file. You can find the version of the package in the datapackage.json file.
    dpm build-package -p PACKAGE_NAME@version nodejs # run with `python` for a python client library

This will write the built artifact to ./dist/{target}/{versioned-package}/ by default, using the version value in your descriptor. You may override the output directory with the --out-dir <path> option on dpm build-package.
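
Since the `-p` argument combines the package name with the version from the descriptor, it can be derived rather than typed. Here is a minimal sketch, assuming the descriptor exposes top-level `name` and `version` string fields; that layout, and the helper names, are assumptions rather than something this page specifies.

```typescript
// Assumed descriptor shape: top-level `name` and `version` strings.
interface Descriptor {
  name: string;
  version: string;
}

// Parse descriptor JSON text and return the PACKAGE_NAME@version reference.
function packageRef(descriptorJson: string): string {
  const descriptor = JSON.parse(descriptorJson) as Partial<Descriptor>;
  if (typeof descriptor.name !== "string" || typeof descriptor.version !== "string") {
    throw new Error("descriptor is missing a string `name` or `version`");
  }
  return `${descriptor.name}@${descriptor.version}`;
}

// Assemble the build command shown above for a given target.
function buildCommand(ref: string, target: "nodejs" | "python"): string {
  return `dpm build-package -p "${ref}" ${target}`;
}

// Example with a made-up descriptor.
const exampleDescriptor = JSON.stringify({ name: "My Package", version: "0.1.0" });
console.log(buildCommand(packageRef(exampleDescriptor), "nodejs"));
// prints: dpm build-package -p "My Package@0.1.0" nodejs
```

In a real project the JSON text would come from reading datapackage.json, for example with `fs.readFileSync("datapackage.json", "utf8")`.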

Import and run queries

  1. Share the library locally and import it into your project.
    npm install ./dist/nodejs/YOUR_PACKAGE_NAME-1.0.0.tgz # for Python, use: python -m pip install ./dist/python/YOUR_PACKAGE_NAME-1.0.0.tar.gz
  2. Then, from a module in the same directory where you ran the above command, import the tables you need.
    import { Table1, Table2, Table3 } from 'YOUR_PACKAGE_NAME'; // in your project's source code
  3. Write and run some queries, using the first query above as a template if needed.

See the documentation on sharing your package for further details.

Update your package

Over time, you'll need to update your package. For example, you may need to evolve the schemas in your source backend.

The following command refreshes the contents of a descriptor. It uses build information stored in the datapackage.json file to recall the parameters that originally generated the descriptor (patch --dataset <dataset_name> or snowflake --table <table_name>... in the above examples). It also applies an appropriate version bump according to semantic versioning.

dpm update ./datapackage.json

The command shows a diff and prompts (y/n) by default; skip with -y. Then, it writes the new descriptor to $PWD/datapackage.json by default; override with -o <path>.

From there, you can proceed with the same build workflow as before.
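
For readers unfamiliar with semantic versioning, the sketch below shows what each kind of bump does to a MAJOR.MINOR.PATCH version. It illustrates the general semver rules only; which bump dpm update chooses for a given schema change is not stated here, and the `bumpVersion` helper is hypothetical.

```typescript
type Bump = "major" | "minor" | "patch";

// Apply a semantic-versioning bump to a plain MAJOR.MINOR.PATCH string.
// Lower-order components reset to 0 when a higher-order one increases.
function bumpVersion(version: string, bump: Bump): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`);
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const patch = Number(match[3]);

  if (bump === "major") {
    return `${major + 1}.0.0`;
  }
  if (bump === "minor") {
    return `${major}.${minor + 1}.0`;
  }
  return `${major}.${minor}.${patch + 1}`;
}

console.log(bumpVersion("0.1.0", "patch")); // prints: 0.1.1
console.log(bumpVersion("0.1.0", "minor")); // prints: 0.2.0
console.log(bumpVersion("0.1.0", "major")); // prints: 1.0.0
```

Under semver, a major bump signals a backward-incompatible change, a minor bump a backward-compatible addition, and a patch bump a backward-compatible fix.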