Commit fd3d098

Sample for using SQL and CSV files
1 parent fe9fd0b commit fd3d098

2 files changed

Lines changed: 13 additions & 0 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -57,6 +57,7 @@ To be added soon, stay tuned!
 - [compras_top_ten_countries.py](spark/compras_top_ten_countries.py)
 - [helpers.py](spark/helpers.py) - basic parse functions to get started quickly
 - Spark SQL
+- [compras_sql.py](spark/compras_sql.py)
 - [container.py](spark/container.py)
 - [container_convertir_a_parquet.py](spark/container_convertir_a_parquet.py)
 - [container_rdd_to_dataset.py](spark/container_rdd_to_dataset.py)

spark/compras_sql.py

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+from os import walk
+
+from pyspark.sql import SparkSession
+
+spark = SparkSession.builder.master("local").appName("SQL").getOrCreate()
+
+df = spark.read.option("delimiter", "|").option("header", "true").csv('./data/compras_tiny.csv')
+df.printSchema()
+df.show()
+
+df.createOrReplaceTempView("compras")
+spark.sql("SELECT tx_id, SUM(item_price) as tx_total FROM compras GROUP BY tx_id").show()
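
For readers without a Spark installation, the aggregation performed by the query in `compras_sql.py` can be sketched with the Python standard library alone. This is an illustration of what `GROUP BY tx_id` with `SUM(item_price)` computes, not the commit's method: the sample rows and the `totals_by_tx` helper are hypothetical, and only the column names `tx_id` and `item_price` and the `|` delimiter come from the diff. Since the script does not set `inferSchema`, Spark reads the CSV columns as strings, so the sketch converts `item_price` to a number explicitly.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real ./data/compras_tiny.csv is not part of this commit.
SAMPLE = """tx_id|item_price
a1|10.0
a1|2.5
b2|4.0
"""


def totals_by_tx(text):
    """Sum item_price per tx_id, mirroring GROUP BY tx_id / SUM(item_price)."""
    # The first line is treated as the header, like option("header", "true").
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    totals = defaultdict(float)
    for row in reader:
        # CSV fields arrive as strings, so convert before adding.
        totals[row["tx_id"]] += float(row["item_price"])
    return dict(totals)


tx_totals = totals_by_tx(SAMPLE)
for tx_id, tx_total in sorted(tx_totals.items()):
    print(tx_id, tx_total)
```

Comparing these totals with the `tx_total` column printed by the Spark job on the same input is a quick way to sanity-check the query.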