Connect to AWS Athena using DataGrip

Go to the AWS IAM console and create a user with the following inline policy (or attach the policy to an existing user):
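As a rough sketch of what such a policy might contain, the Python snippet below builds a policy document that allows running Athena queries, reading the Glue Data Catalog, reading the data bucket, and reading and writing the results bucket. The helper name `build_athena_policy` is hypothetical and the action list is an assumption, so tighten it to your own needs before attaching it.

```python
import json


def build_athena_policy(query_bucket: str, results_bucket: str) -> dict:
    """Return an IAM policy document (as a dict) for querying Athena.

    Assumption: this action list is illustrative only. It is not the
    original policy and may be broader or narrower than you need.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Start queries, poll their status, and fetch results.
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetWorkGroup",
                ],
                "Resource": "*",
            },
            {
                # Read table and schema metadata from the Glue Data Catalog.
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetDatabases",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetPartition",
                    "glue:GetPartitions",
                ],
                "Resource": "*",
            },
            {
                # Read-only access to the bucket that holds the data.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{query_bucket}",
                    f"arn:aws:s3:::{query_bucket}/*",
                ],
            },
            {
                # Read/write access to the bucket that stores query results.
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": [
                    f"arn:aws:s3:::{results_bucket}",
                    f"arn:aws:s3:::{results_bucket}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    policy = build_athena_policy(
        "THE-S3-BUCKET-TO-QUERY", "THE-S3-BUCKET-TO-STORE-QUERY-RESULTS"
    )
    # Paste the printed JSON into the inline policy editor in the IAM console.
    print(json.dumps(policy, indent=2))
```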

Don’t forget to replace THE-S3-BUCKET-TO-QUERY with the S3 bucket that holds the data Athena queries, and THE-S3-BUCKET-TO-STORE-QUERY-RESULTS with an S3 bucket where Athena can store its query results. If you have already queried Athena from the AWS console, the results bucket is named something like s3://aws-athena-query-results-1234567890-eu-west-1. Then create and download an Access Key for that user: open the user and go to the Security Credentials tab.

Once there, click the Access Key button.

Copy the Access key ID and the Secret access key somewhere. Those will be the username and password you need to connect to Athena in Step 3.
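For orientation, DataGrip talks to Athena over JDBC. Below is a minimal sketch of what the connection URL could look like, assuming a Simba-style Athena JDBC driver that accepts the `AwsRegion` and `S3OutputLocation` properties. Check this against the driver version DataGrip downloads for you. The helper name `athena_jdbc_url` is hypothetical. The access key ID and secret access key go into DataGrip's user and password fields rather than into the URL.

```python
def athena_jdbc_url(region: str, results_location: str) -> str:
    """Build an Athena JDBC URL.

    Assumption: the driver understands the AwsRegion and S3OutputLocation
    connection properties (Simba-style Athena JDBC driver).
    """
    if not results_location.startswith("s3://"):
        raise ValueError("results_location must be an s3:// URI")
    return (
        f"jdbc:awsathena://AwsRegion={region};"
        f"S3OutputLocation={results_location};"
    )


if __name__ == "__main__":
    # Example values only: use your own region and results bucket.
    print(
        athena_jdbc_url(
            "eu-west-1", "s3://aws-athena-query-results-1234567890-eu-west-1"
        )
    )
```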

If you don’t see all the tables you expected, you can go to the Schemas tab and select the schemas you need.

That’s it. Happy querying massive amounts of data on S3 from the convenience of DataGrip. Do be wary of the costs: in Athena, you pay $5 per TB scanned, and DataGrip doesn’t show how much data each query scanned. As a general tip:

In practice, we’ve found that Athena costs very little for what it delivers. For comparison, a small RDS database easily costs about $35 per month. An example aggregation query on a table of 1 billion rows, stored in Parquet format, scanned about 1GB, because Athena only reads the columns the aggregation needs and Parquet compresses really well. The query took 6 seconds to execute. At $5 per TB, scanning 1GB costs about half a cent, so for the same $35 we can run 7000 of those queries with Athena. Asking a small RDS database to run 7000 aggregation queries on 1 billion rows would probably take several months, and it would need a lot more storage than 1GB. :-)
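The arithmetic behind that comparison can be sketched in a few lines of Python. The $5 per TB price and the $35 budget come from the text above. Treating 1 TB as 10^12 bytes is an assumption made here so the numbers match, and the function names are purely illustrative.

```python
import math

PRICE_PER_TB_USD = 5.0   # Athena price per TB scanned, as quoted above
BYTES_PER_TB = 10**12    # assumption: decimal terabytes
BYTES_PER_GB = 10**9     # assumption: decimal gigabytes


def query_cost_usd(bytes_scanned: int) -> float:
    """Cost of a single query that scans `bytes_scanned` bytes."""
    return bytes_scanned * PRICE_PER_TB_USD / BYTES_PER_TB


def queries_for_budget(budget_usd: float, bytes_per_query: int) -> int:
    """How many queries of a given scan size fit into a budget."""
    return math.floor(
        budget_usd * BYTES_PER_TB / (PRICE_PER_TB_USD * bytes_per_query)
    )


if __name__ == "__main__":
    one_gb = 1 * BYTES_PER_GB
    print(f"Cost per 1GB query: ${query_cost_usd(one_gb):.4f}")
    print(f"Queries for $35:    {queries_for_budget(35, one_gb)}")
```

To estimate your own costs, plug in the amount of data scanned that Athena reports for each query, for example in the query history of the AWS console.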
