This repository was archived by the owner on Nov 11, 2022. It is now read-only.

Commit 7cdb503

Update README.md for Dataflow SDK 2.0.0 release (#578)
1 parent c01287e commit 7cdb503

1 file changed

Lines changed: 33 additions & 47 deletions

README.md

@@ -27,10 +27,11 @@ underlying source code is hosted in the
 [Apache Beam repository](https://github.com/apache/beam).
 
 [General usage](https://cloud.google.com/dataflow/getting-started) of Google
-Cloud Dataflow does **not** require use of this repository. Instead:
+Cloud Dataflow does **not** require use of this repository. Instead, you can do
+any one of the following:
 
-1. depend directly on a specific
-[version](https://cloud.google.com/dataflow/release-notes/java) of the SDK in
+1. Depend directly on a specific
+[version](https://cloud.google.com/dataflow/downloads) of the SDK in
 the [Maven Central Repository](http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22com.google.cloud.dataflow%22)
 by adding the following dependency to development
 environments like Eclipse or Apache Maven:
@@ -41,69 +42,52 @@ environments like Eclipse or Apache Maven:
 <version>version_number</version>
 </dependency>
 
-1. download the example pipelines from the separate
+1. Download the example pipelines from the separate
 [DataflowJavaSDK-examples](https://github.com/GoogleCloudPlatform/DataflowJavaSDK-examples)
 repository.
 
-<!-- 1. If you are using [Eclipse](https://eclipse.org/) integrated development
+1. If you are using [Eclipse](https://eclipse.org/) integrated development
 environment (IDE), the
-[Cloud Dataflow Plugin for Eclipse](https://cloud.google.com/dataflow/getting-started-eclipse)
-provides tools to create and execute Dataflow pipelines locally and on the
-Dataflow Service. -->
+[Cloud Dataflow Plugin for Eclipse](https://cloud.google.com/dataflow/docs/quickstarts/quickstart-java-eclipse)
+provides tools to create and execute Dataflow pipelines inside Eclipse.
 
-## Status [![Build Status](https://travis-ci.org/GoogleCloudPlatform/DataflowJavaSDK.svg?branch=v2)](https://travis-ci.org/GoogleCloudPlatform/DataflowJavaSDK)
+## Status [![Build Status](https://api.travis-ci.org/GoogleCloudPlatform/DataflowJavaSDK.svg?branch=master)](https://travis-ci.org/GoogleCloudPlatform/DataflowJavaSDK)
 
-This branch is a work-in-progress for the Dataflow SDK for Java, version 2.0.0.
-It is currently supported on the Cloud Dataflow service in Beta.
+Both the SDK and the Dataflow Service are generally available and considered
+stable and fully qualified for production use.
 
-<!--Both the SDK and the Dataflow Service are generally available, open to all
-developers, and considered stable and fully qualified for production use.-->
+This [`master`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/) branch
+contains code to build Dataflow SDK 2.0.0 and newer, as a distribution of Apache
+Beam. Pre-Beam SDKs, versions 1.x, are maintained in the
+[`master-1.x`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/master-1.x)
+branch.
 
 ## Overview
 
 The key concepts in this programming model are:
 
-* [`PCollection`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/values/PCollection.java):
-represents a collection of data, which could be bounded or unbounded in size.
-* [`PTransform`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/transforms/PTransform.java):
-represents a computation that transforms input PCollections into output
-PCollections.
-* [`Pipeline`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/Pipeline.java):
-manages a directed acyclic graph of PTransforms and PCollections that is ready
-for execution.
-* [`PipelineRunner`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/PipelineRunner.java):
-specifies where and how the pipeline should execute.
+* `PCollection`: represents a collection of data, which could be bounded or
+unbounded in size.
+* `PTransform`: represents a computation that transforms input PCollections
+into output PCollections.
+* `Pipeline`: manages a directed acyclic graph of PTransforms and PCollections
+that is ready for execution.
+* `PipelineRunner`: specifies where and how the pipeline should execute.
 
 We provide two runners:
 
-1. The [`DirectRunner`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java)
-runs the pipeline on your local machine.
-1. The [`DataflowRunner`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java)
-submits the pipeline to the Dataflow Service, where it runs using managed
-resources in the [Google Cloud Platform](https://cloud.google.com) (GCP).
+1. The `DirectRunner` runs the pipeline on your local machine.
+1. The `DataflowRunner` submits the pipeline to the Cloud Dataflow Service,
+where it runs using managed resources in the
+[Google Cloud Platform](https://cloud.google.com).
 
 The SDK is built to be extensible and support additional execution environments
 beyond local execution and the Google Cloud Dataflow Service. Apache Beam
-contains additional SDKs, runners, IO connectors, etc.
+contains additional SDKs, runners, and IO connectors.
 
 ## Getting Started
 
-This repository consists of the following parts:
-
-* The [`sdk`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/sdk)
-module provides a set of basic Java APIs to program against.
-* The [`examples`](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples)
-module provides a few samples to get started. We recommend starting with the
-`WordCount` example.
-
-The following command will build both the `sdk` and `example` modules and
-install them in your local Maven repository:
-
-mvn clean install
-
-After building and installing, you can execute the `WordCount` and other
-example pipelines by following the instructions in this
-[README](https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/README.md).
+Please try our [Quickstarts](https://cloud.google.com/dataflow/docs/quickstarts).
 
 ## Contact Us
 
@@ -117,5 +101,7 @@ on GitHub to report any bugs, comments or questions regarding SDK development.
 
 * [Google Cloud Dataflow](https://cloud.google.com/dataflow/)
 * [Apache Beam](https://beam.apache.org/)
-* [Dataflow Concepts and Programming Model](https://cloud.google.com/dataflow/model/programming-model)
-* [Java API Reference](https://cloud.google.com/dataflow/java-sdk/JavaDoc/index)
+* [Dataflow Concepts and Programming Model](https://beam.apache.org/documentation/programming-guide/)
+* [Java API Reference](https://beam.apache.org/documentation/sdks/javadoc/)
+
+_Apache, Apache Beam and the orange letter B logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries._
