Snowflake Events May 2024: Your Burning Questions, Answered!

From Data Chaos to Clarity – Unveiling Snowflake’s Unified Cloud Data Platform

May 7-8, 2024

During the Snowflake events in Vilnius and Riga on May 7-8, 2024, attendees had the opportunity to pose questions using Slido. Not all questions were answered on the day, so we have compiled them below. Special thanks to our Snowflake speaker and workshop lecturer, Olli Ek, for providing the answers.

Data Management and Governance

What are your preferred tools to create data models in Snowflake?

This depends on the exact requirements. For logical/conceptual modeling, at least in Finland, we often see Ellie (ellie.ai) used, but naturally there are many other tools in that space as well. Probably the most common solution for actually building the required layers inside Snowflake is dbt.

For a full list of tools within the Snowflake ecosystem, you can check this page.

How is data lineage organized? How do we know where data from a source is used (impact), or where the data in a report is sourced from (lineage)?

Snowflake currently offers capabilities for tracking lineage for all data operations executed on the Snowflake platform. However, if you run some of the operations elsewhere, e.g. on the cloud layer (Azure, AWS, GCP), we cannot track those. In those cases we often see customers use data catalog tools such as Collibra, Alation, or several others (open source tools are available as well).

Some links for getting more information on lineage in Snowflake, offered under our Horizon capabilities:

DATA LINEAGE: DOCUMENTING THE DATA LIFE CYCLE

TAG_REFERENCES_WITH_LINEAGE
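
As a rough illustration of the second link, the sketch below only assembles the SQL text for a call to the TAG_REFERENCES_WITH_LINEAGE table function in the SNOWFLAKE.ACCOUNT_USAGE schema. The helper name and the tag name my_db.my_schema.cost_center are hypothetical, and running the query would still require a Snowflake session with access to ACCOUNT_USAGE.

```python
def tag_lineage_query(tag_name: str) -> str:
    """Build a query listing the objects associated with a tag.

    tag_name is assumed to be fully qualified: database.schema.tag.
    """
    if tag_name.count(".") != 2:
        raise ValueError("expected a fully qualified tag name: db.schema.tag")
    # Double any single quotes so the name stays inside the SQL string literal.
    escaped = tag_name.replace("'", "''")
    return (
        "SELECT * FROM TABLE("
        f"SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES_WITH_LINEAGE('{escaped}')"
        ");"
    )


# Hypothetical tag name, used for illustration only.
print(tag_lineage_query("my_db.my_schema.cost_center"))
```

The resulting statement can be pasted into a worksheet or passed to whichever Snowflake client you already use.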

We also have more advanced lineage capabilities, including a UI, currently in Private Preview. As this is a preview feature, do not make any decisions based on this information. We might hear additional details during Snowflake Summit in June (no promises).

[Screenshot: preview of the lineage feature; the final solution might differ.]

Seems that Snowflake wants to be a one-stop shop. Are there any plans to add data lineage capability?

Yes, see the answer above.

Please briefly explain the Data governance you are applying for your customers.

Data governance is a very wide topic and hard to explain briefly, so the best place to start would be this.

Can you share Snowflake’s roadmap to understand where you are heading?

Please reach out to the Snowflake team (Mathias, Janne, Markus, Olli) and we can agree on how to share what we are building next.

Security and Compliance

How security can be synced with Snowflake and running on it app?

This question is a bit unclear. If you can rephrase it, I'm happy to provide additional details.

How is security ensured when sharing data between organizations?

The organization that shares the data has full control over the security settings on the shared data. It can also see what kinds of queries are executed on the shared data. Various methods, such as masking, row access policies, and secure views, are often used when sharing data.
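
One of the methods mentioned above, a secure view, can be sketched as follows. This is a minimal sketch, not a reference design: it only builds the SQL text, and the helper name, the mapping table, and the column names access_id and snowflake_account are assumptions chosen for illustration. The idea is that each consumer account sees only the rows mapped to it, because the view filters on CURRENT_ACCOUNT().

```python
import re
from typing import List

# Plain identifiers, optionally qualified as db.schema.object.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def _check(name: str) -> str:
    """Reject anything that is not a plain (optionally qualified) identifier."""
    if not _IDENT.match(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def secure_view_for_share(view: str, base_table: str,
                          access_table: str, share: str) -> List[str]:
    """Return SQL that exposes only the rows mapped to the querying account."""
    view, base_table, access_table, share = map(
        _check, (view, base_table, access_table, share)
    )
    create_view = (
        f"CREATE SECURE VIEW {view} AS\n"
        f"  SELECT d.*\n"
        f"  FROM {base_table} d\n"
        f"  JOIN {access_table} a\n"
        f"    ON d.access_id = a.access_id\n"          # hypothetical column
        f"   AND a.snowflake_account = CURRENT_ACCOUNT();"  # hypothetical column
    )
    grant = f"GRANT SELECT ON VIEW {view} TO SHARE {share};"
    return [create_view, grant]


# Hypothetical object names, for illustration only.
for stmt in secure_view_for_share(
    "sales.public.orders_v", "sales.private.orders",
    "sales.private.sharing_access", "partner_share",
):
    print(stmt)
```

The share would also need USAGE on the database and schema that contain the view before a consumer can query it.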

In addition, we have a solution for building Data Clean Rooms, which two or more parties can use to share and combine data sets without exposing any PII or details to any of the parties (technically, we run blind joins and other advanced features to secure that kind of setup). You can read more here.

Is it possible to obfuscate data on the fly when cloning db?

Yes, there are different ways to achieve this. We'll be happy to learn about your requirements so that we can provide the exact design principles and code examples for achieving this.

Is it possible to limit the access to shared data to Snowflake users only? To avoid paying for access via UI.

This question is a bit unclear. Snowflake offers very advanced capabilities for limiting access to the platform and data through RBAC, row access policies, and other similar features. We also offer network-level security, where you can whitelist certain IPs, apply geofencing around the data, and so on. We'd be happy to understand more of your requirements to provide a more concrete answer.

Data Sharing and Migration

What if the purchase is a static dataset and one wants to keep a copy without fear that the provider will revoke access (regarding data sharing inside the Snowflake platform)?

You can make a copy of the data you receive through a share. You then own that copy: it is stored locally in your Snowflake account and will not be deleted if the provider revokes access.

Once data is shared with another Snowflake client, what is the process of this secondary client sharing the same data with a third Snowflake client?

Shared databases and all the objects in them cannot be re-shared with other accounts. You would need to take an extra step and create a new share from a local copy of the shared data (make sure to check your usage rights before you start using a share). This page offers more information.
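
The extra step described above might look roughly like the following sketch, which assumes the provider's terms allow it. The code only builds the SQL text: it copies a shared table into a local table with CREATE TABLE ... AS SELECT, then creates a new share on that local copy. The helper name and all object and account names are hypothetical.

```python
from typing import List


def reshare_statements(shared_table: str, local_db: str, local_schema: str,
                       local_table: str, new_share: str,
                       consumer_account: str) -> List[str]:
    """Return SQL that copies shared data locally and shares the copy onward."""
    target = f"{local_db}.{local_schema}.{local_table}"
    return [
        # 1. Materialize a local copy that you own.
        f"CREATE TABLE {target} AS SELECT * FROM {shared_table};",
        # 2. Create a new share and grant it access to the local copy.
        f"CREATE SHARE {new_share};",
        f"GRANT USAGE ON DATABASE {local_db} TO SHARE {new_share};",
        f"GRANT USAGE ON SCHEMA {local_db}.{local_schema} TO SHARE {new_share};",
        f"GRANT SELECT ON TABLE {target} TO SHARE {new_share};",
        # 3. Make the share visible to the third account.
        f"ALTER SHARE {new_share} ADD ACCOUNTS = {consumer_account};",
    ]


# Hypothetical names, for illustration only.
for stmt in reshare_statements(
    "vendor_db.public.prices", "my_db", "public", "prices_copy",
    "prices_share", "myorg.partner_account",
):
    print(stmt)
```

The first statement alone also covers the earlier case of keeping a copy that survives a revoked share.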

How easy is it to migrate from a classical solution (Oracle, MS, etc.) to Snowflake? And conversely, how easy is it to move back?

As a customer, you own your data, and Snowflake does not restrict you from exporting it in any way. It is actually a very easy process to export your data from Snowflake to a cloud bucket: this page shows how it's done for S3 (the same process applies to Azure/GCP).
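
As a hedged illustration of such an export, the sketch below assembles a COPY INTO statement that unloads a table to an S3 location as Parquet files. It only builds the SQL text. The helper name, the table, the bucket path, and the storage integration name are assumptions, and a storage integration with access to the bucket would need to exist already.

```python
def unload_to_s3_sql(table: str, s3_url: str, storage_integration: str) -> str:
    """Build a COPY INTO statement that unloads a table to S3 as Parquet."""
    if not s3_url.startswith("s3://"):
        raise ValueError("expected an s3:// URL")
    # Double any single quotes so the URL stays inside the SQL string literal.
    escaped_url = s3_url.replace("'", "''")
    return (
        f"COPY INTO '{escaped_url}'\n"
        f"  FROM {table}\n"
        f"  STORAGE_INTEGRATION = {storage_integration}\n"
        f"  FILE_FORMAT = (TYPE = PARQUET);"
    )


# Hypothetical names, for illustration only.
print(unload_to_s3_sql("my_db.public.orders", "s3://my-bucket/unload/orders/", "my_s3_int"))
```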

The difficulty of a migration depends on what the current system looks like. We have done thousands of migrations from Oracle, MS SQL, and pretty much all other databases. We can show our best practices, help with code conversion, and, together with Infotrust, make the migration as efficient as possible.

Is it possible to share data/services stored in Snowflake to another region worldwide (to another cloud region)?

Yes, you can share from any region/cloud to any other region/cloud. If you want to do this in an automated fashion as well, Snowflake Listings is the way to achieve that; see here.

Platform Capabilities and Features

Could you explain what data engineering capabilities Snowflake offers?

A good place to start looking into this can be found here. We're happy to talk about this in more detail.

You can also try, for example, these Quickstart labs:

Data Engineering Pipelines with Snowpark Python

Data Engineering with Snowpark Python and dbt

Intro to Data Engineering with Snowpark Python

Talking about apps and programs, does Snowflake have a native versioning solution?

Yes. The Snowflake Native App Framework enables providers to update a Snowflake Native App to add new functionality, fix bugs, and make other changes. More here.

Are there plans to allow Snowpark code to run completely locally on your machine with some mock data/files?

Yes. The Snowpark Python local testing framework allows you to create and operate on Snowpark Python DataFrames locally without connecting to a Snowflake account. Read here.

Kudos on allowing Iceberg catalog option outside of Snowflake. Any plans to extend it to Hudi or Delta, to increase potential customers?

As of now we’re focusing on Iceberg only – please check back after some time and we can provide additional details if we’re planning to add support for other table formats.

Are there clients who use Snowflake as a transactional DB and not as DWH? Is it a proper use case?

This is currently being done in several places, but the platform is not yet ready for all transactional DB use cases. We'd love to talk more about your specific need, and we can then tell you whether it's a viable use case. Having said that, Unistore will open up a lot of new possibilities; you can read more about it here and here (in preview).

Shared data sources – is it possible to use shared tables within the query along with personal tables (e.g., in joins)?

Yes.

Setup, Training, and Initial Queries

Snowflake. What shall we begin with and what do we need to set it up (except money of course)?

We're happy to arrange a session with you, together with Infotrust, to help you get started and set everything up according to our best practices.

How long does it take to become a Snowflake expert? Need to understand the average lead time to set up and apply your solution.

As Snowflake strives for extreme simplicity, the learning curve is short. We offer a lot of free training through Snowflake University.

In a few days you can get all the information you need to get started successfully. Together with Infotrust, we can also provide onsite training if that is your preferred method.

Reminder:

Olli promised to send listeners the material on RBAC setup and management. Here it is.

Snowflake May 2024 Vilnius and Riga events presentations and videos.

Snowflake May 2024 Vilnius and Riga events in pictures.

