Complete group-based policy architecture write-up for Developer's Guide

author Ethan Spiegel <emspiege@us.ibm.com>

Thu, 18 Sep 2014 09:34:44 +0000 (02:34 -0700)

committer Ethan Spiegel <emspiege@us.ibm.com>

Thu, 25 Sep 2014 01:19:15 +0000 (18:19 -0700)
diff --git a/manuals/developers-guide/src/main/asciidoc/groupbasedpolicy.adoc b/manuals/developers-guide/src/main/asciidoc/groupbasedpolicy.adoc

index 03c74386c2e224956f38eba307148b16ce82b75e..57e2407607a060935e236e3bc905005723d00211 100644 (file)
--- a/manuals/developers-guide/src/main/asciidoc/groupbasedpolicy.adoc
+++ b/manuals/developers-guide/src/main/asciidoc/groupbasedpolicy.adoc
@@ -1,4 +1,902 @@
-== Group Based Policy
+== Group-Based Policy
  
-Chapter on Group Based Policy
+This chapter describes the Group-Based Policy project. The Group-Based Policy project defines an application-centric policy model for OpenDaylight that separates information about application connectivity requirements from information about the underlying details of the network infrastructure.
+
+=== Group-Based Policy Architecture Overview
+
+.Group-Based Policy Architecture
+
+image::Group-based_policy_architecture.png[Group-Based Policy Architecture]
+
+State repositories (blue) communicate using MD-SAL (orange) with external orchestration systems as well as internally with renderers (green) through the renderer common framework (red).
+
+The components of the architecture are divided into two main categories. First, there are components that are responsible for managing the policy, configuration, and related state. These are the components that deal with the higher-order group-based policy that exists independent of the underlying infrastructure. Second, there are the renderer components that are responsible for applying the policy model to the underlying network infrastructure. The system can potentially support a variety of renderers that may have very different sets of features and different approaches for enabling the policy that the user has requested.
+
+The key to understanding the architecture is to first understand the policy model -- much of the design of the system flows directly from the design of the policy model.
+
+=== Policy Model
+
+The policy model is built around the idea of placing endpoints into groups that share the same semantics, then defining which other groups those endpoints need to communicate with, and finally defining how these endpoints need to communicate. In this way, we represent the requirements of the application and then force the infrastructure to figure out how to meet these requirements, rather than defining the policy in terms of the underlying infrastructure.
+
+==== Policy Concepts
+
+This section describes some of the most important concepts in the policy model. See the next section on <<policy_resolution,Policy Resolution>> for a description of how these fit together to determine how to apply the policy to the network.
+
+Endpoint::
+An _endpoint_ is a specific device in the network. It could be a VM interface, a physical interface, or other network device. Endpoints are defined and assigned to endpoint groups through mechanisms that are not specified by the policy model (See <<endpoint_repository,Endpoint Repository>> for more information). Endpoints can have associated _conditions_ that are just labels that represent some potentially-transient status information about an endpoint.
+Endpoint Group::
+_Endpoint groups_ are sets of endpoints that share a common set of policies. Endpoint groups can participate in _contracts_ that determine the kinds of communication that are allowed. They also expose both _requirements_ and _capabilities_, which are labels that help to determine how contracts will be applied. An endpoint group is allowed to specify a parent endpoint group from which it inherits.
+Contract::
+_Contracts_ determine which endpoints can communicate and in what way. Contracts between pairs of endpoint groups are selected by the contract selectors defined by the endpoint group. Contracts expose _qualities_, which are labels that can help endpoint groups to select contracts. Once the contract is selected, contracts have _clauses_ that can match against requirements and capabilities exposed by endpoint groups, as well as any conditions that may be set on endpoints, in order to activate _subjects_ that can allow specific kinds of communication. A contract is allowed to specify a parent contract from which it inherits.
+Clause::
+_Clauses_ are defined as part of a contract. Clauses determine how a contract should be applied to particular endpoints and endpoint groups. Clauses can match against requirements and capabilities exposed by endpoint groups, as well as any conditions that may be set on endpoints. Matching clauses define some set of _subjects_ which can be applied to the communication between the pairs of endpoints.
+Subject::
+_Subjects_ describe some aspect of how two endpoints are allowed to communicate. Subjects define an ordered list of rules that will match against the traffic and perform any necessary actions on that traffic. No communication is allowed unless a subject allows that communication.
+
+[[policy_resolution]]
+==== Introduction to Policy Resolution
+
+There are a lot of concepts to unpack and it can be difficult to see how this all fits together.  Let's imagine that we want to analyze a particular flow of traffic in the network and walk through the policy resolution process for that flow.  The key here is that the policy resolution process happens logically in three phases.  First, we need to select the contracts that are in scope for the endpoint groups of the endpoints of the flow.  Next, we select the set of subjects that apply to the endpoints of the flow.  Finally, we apply the rules from the applicable subjects to the actual network traffic in the flow.
+
+Note that this description gives a semantic understanding of how the policy model should be applied.  The steps described here may or may not correspond to an actual efficient implementation of this policy model.
+
+===== Contract Selection
+
+The first step in policy resolution is to select the contracts that are in scope.  For a particular flow, we look up the endpoint groups for each of the endpoints involved in the flow.
+
+Endpoint groups participate in contracts either as a _provider_ or as a _consumer_.  Each endpoint group can participate in many contracts at the same time, but for each contract it can be in only one role at a time.  In addition, there are two ways for an endpoint group to select a contract: either with a _named selector_ or with a _target selector_.  Named selectors simply select a specific contract by its contract ID.  Target selectors allow for additional flexibility by matching against _qualities_ of the contract's _target_.
+
+Thus, there are a total of 4 kinds of contract selector:
+
+provider named selector::
+Select a contract by contract ID, and participate as a provider.
+provider target selector::
+Match against a contract's target with a quality matcher, and participate as a provider.
+consumer named selector::
+Select a contract by contract ID, and participate as a consumer.
+consumer target selector::
+Match against a contract's target with a quality matcher, and participate as a consumer.
+
+To determine which contracts are in scope for our flow, we must find contracts that the source endpoint group selects as either a provider or a consumer while the destination endpoint group selects the same contract in the opposite role.  For example, if endpoint _x_ in endpoint group _X_ is communicating with endpoint _y_ in endpoint group _Y_, a contract _C_ is in scope if either _X_ selects _C_ as a provider and _Y_ selects _C_ as a consumer, or _X_ selects _C_ as a consumer and _Y_ selects _C_ as a provider.
+
+The details of how quality matchers work are described further below in <<matchers,Matchers>>.  For now, we can simply state that quality matchers provide a flexible mechanism for selecting the contract based on labels.
+
+The end result of the contract selection phase can be thought of as a set of tuples representing selected contract scopes.  The fields of the tuple are:
+
+* Contract ID
+* The provider endpoint group ID
+* The name of the selector in the provider endpoint group that was used to select the contract, which we'll call the _matching provider selector_.
+* The consumer endpoint group ID
+* The name of the selector in the consumer endpoint group that was used to select the contract, which we'll call the _matching consumer selector_.
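+
+As a rough illustration of this phase, the following Python sketch computes the contract scope tuples for a pair of endpoint groups. It is not the project's implementation: the names `NamedSelector`, `EndpointGroup`, `ContractScope`, and `select_contracts` are hypothetical, only named selectors are modeled, and the rule that a group holds a single role per contract is not enforced.
+
```python
from dataclasses import dataclass, field
from typing import List, NamedTuple, Set


@dataclass
class NamedSelector:
    """Hypothetical named selector: a selector name plus the contract IDs it selects."""
    name: str
    contract_ids: Set[str]


@dataclass
class EndpointGroup:
    """Hypothetical endpoint group that holds only named selectors."""
    id: str
    provider_selectors: List[NamedSelector] = field(default_factory=list)
    consumer_selectors: List[NamedSelector] = field(default_factory=list)


class ContractScope(NamedTuple):
    """One selected contract scope, with the five fields listed above."""
    contract_id: str
    provider_group_id: str
    matching_provider_selector: str
    consumer_group_id: str
    matching_consumer_selector: str


def select_contracts(x: EndpointGroup, y: EndpointGroup) -> List[ContractScope]:
    """Return every contract selected by one group as provider and the other as consumer."""
    scopes = []
    # Try both role assignments: X provides and Y consumes, then the reverse.
    for provider, consumer in ((x, y), (y, x)):
        for p_sel in provider.provider_selectors:
            for c_sel in consumer.consumer_selectors:
                # A contract is in scope when both selectors name it.
                for contract_id in sorted(p_sel.contract_ids & c_sel.contract_ids):
                    scopes.append(ContractScope(
                        contract_id, provider.id, p_sel.name, consumer.id, c_sel.name))
    return scopes
```
+
+Because both role assignments are tried, the result does not depend on which endpoint group is the source of the flow.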
+
+===== Subject Selection
+
+The second phase in policy resolution is to determine which subjects are in scope.  The subjects allow us to define what kinds of communication are allowed between endpoints in the endpoint groups.  For each of the selected contract scopes from the contract selection phase, we'll need to apply the subject selection procedure.
+
+Before we can discuss how the subjects are matched, we need to first examine what we match against to bring those subjects into scope.  We match against labels called capabilities, requirements, and conditions.  Endpoint groups have capabilities and requirements, while endpoints have conditions.
+
+====== Requirements and Capabilities
+
+When acting as a provider, endpoint groups expose _capabilities_, which are labels representing specific pieces of functionality that can be exposed to other endpoint groups and that may meet functional requirements of those endpoint groups.  When acting as a consumer, endpoint groups expose _requirements_, which are labels that represent the fact that the endpoint group requires some specific piece of functionality.  As an example, we might create a capability called "user-database" which indicates that an endpoint group contains endpoints that implement a database of users.  We might create a requirement also called "user-database" to indicate an endpoint group contains endpoints that will need to communicate with the endpoints that expose this service.  Note that in this example the requirement and capability have the same name, but the user need not follow this convention.
+
+We examine the matching provider selector (that was used by the provider endpoint group to select the contract) to determine the capabilities exposed by the provider endpoint group for this contract scope.  The provider selector will have a list of capabilities either directly included in the provider selector or inherited from a parent selector or parent endpoint group (See <<inheritance,Inheritance>> below).  Similarly, the matching consumer selector will expose a set of requirements.
+
+====== Conditions
+
+Endpoints can have _conditions_, which are labels representing some relevant piece of operational state related to the endpoint.  An example of a condition might be "malware-detected," or "authentication-succeeded."  We'll be able to use these conditions to affect how that particular endpoint can communicate.  To continue with our example, the "malware-detected" condition might cause an endpoint's connectivity to be cut off, while "authentication-succeeded" might open up communication with services that require an endpoint to be first authenticated and then forward its authentication credentials.
+
+Conditions do not actually appear in the policy configuration model other than as a named reference.  To determine the set of conditions that apply to a particular endpoint, the endpoint will need to be looked up in the endpoint registry, and its associated condition labels retrieved from there.
+
+====== Clauses
+
+Clauses are what will do the actual selection of subjects.  A clause has four lists of matchers in two categories.  In order for a clause to become active, all four lists of matchers must match.  A matching clause will select all the subjects referenced by the clause.  Note that an empty list of matchers counts as a match.
+
+The first category is the consumer matchers, which match against the consumer endpoint group and endpoints.  The consumer matchers are:
+
+Requirement matchers::
+Matches against requirements in the matching consumer selector.
+Consumer condition matchers::
+Matches against conditions on endpoints in the consumer endpoint group.
+
+The second category is the provider matchers, which match against the provider endpoint group and endpoints.  The provider matchers are:
+
+Capability matchers::
+Matches against capabilities in the matching provider selector.
+Provider condition matchers::
+Matches against conditions on endpoints in the provider endpoint group.
+
+Clauses have a list of subjects that apply when all the matchers in the clause match.  The output of the subject selection phase logically is a set of subjects that are in scope for any particular pair of endpoints.
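+
+The subject selection step can be sketched in Python as follows. This is an illustrative model only: `Clause`, `has_label`, and `select_subjects` are hypothetical names, and each matcher is reduced to a plain predicate over a set of labels so that the sketch can focus on the rule that all four matcher lists must match.
+
```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, Iterable, List, Set

Labels = FrozenSet[str]
# Assumption: a matcher is modeled as a predicate over the labels it inspects.
Matcher = Callable[[Labels], bool]


def has_label(label: str) -> Matcher:
    """Build a simple matcher that requires one label to be present."""
    return lambda labels: label in labels


@dataclass
class Clause:
    name: str
    subject_refs: List[str]
    requirement_matchers: List[Matcher] = field(default_factory=list)
    consumer_condition_matchers: List[Matcher] = field(default_factory=list)
    capability_matchers: List[Matcher] = field(default_factory=list)
    provider_condition_matchers: List[Matcher] = field(default_factory=list)

    def is_active(self, requirements: Labels, consumer_conditions: Labels,
                  capabilities: Labels, provider_conditions: Labels) -> bool:
        checks = (
            (self.requirement_matchers, requirements),
            (self.consumer_condition_matchers, consumer_conditions),
            (self.capability_matchers, capabilities),
            (self.provider_condition_matchers, provider_conditions),
        )
        # all() over nothing is True, so an empty matcher list counts as a match.
        return all(m(labels) for matchers, labels in checks for m in matchers)


def select_subjects(clauses: Iterable[Clause], requirements: Labels,
                    consumer_conditions: Labels, capabilities: Labels,
                    provider_conditions: Labels) -> Set[str]:
    """Collect the subjects referenced by every active clause."""
    selected: Set[str] = set()
    for clause in clauses:
        if clause.is_active(requirements, consumer_conditions,
                            capabilities, provider_conditions):
            selected.update(clause.subject_refs)
    return selected
```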
+
+[[rule_application]]
+===== Rule Application
+
+Now that we have a list of subjects that apply to the traffic between a particular set of endpoints, we're ready to describe how we actually apply policy to allow those endpoints to communicate.  The applicable subjects from the previous step will each contain a set of rules.
+
+Rules consist of a set of _classifiers_ and a set of _actions_.  Classifiers match against traffic between two endpoints.  An example of a classifier would be something that matches against all TCP traffic on port 80, or one that matches against HTTP traffic containing a particular cookie.  Actions are specific actions that need to be taken on the traffic before it reaches its destination.  Actions could include tagging or encapsulating the traffic in some way, redirecting the traffic, or applying some service chain.  For more information on how classifiers and actions are defined, see below under <<subject_features,Subject Features>>.
+
+If and only if _all_ classifiers on a rule match, _all_ the actions on that rule are applied (in order) to the traffic.  Only the first matching rule will apply.
+
+Rules, subjects, and actions have an _order_ parameter, where a lower order value means that a particular item will be applied first.  All rules from a particular subject will be applied before the rules of any other subject, and all actions from a particular rule will be applied before the actions from another rule.  If more than one item has the same order parameter, ties are broken with a lexicographic ordering of their names, with earlier names having logically lower order.
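+
+A minimal Python sketch of this ordering follows. The types `Action`, `Rule`, and `Subject` and the function `resolve_actions` are hypothetical, classifiers are modeled as plain predicates over a dictionary that stands in for a packet, and a rule with no classifiers is assumed to match all traffic.
+
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]
Classifier = Callable[[Packet], bool]


@dataclass
class Action:
    name: str
    order: int = 0


@dataclass
class Rule:
    name: str
    order: int
    classifiers: List[Classifier] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)


@dataclass
class Subject:
    name: str
    order: int
    rules: List[Rule] = field(default_factory=list)


def by_order(items):
    """Sort by the order parameter, breaking ties lexicographically by name."""
    return sorted(items, key=lambda item: (item.order, item.name))


def resolve_actions(subjects: List[Subject], packet: Packet) -> Optional[List[str]]:
    """Return the ordered action names of the first matching rule, or None."""
    for subject in by_order(subjects):
        for rule in by_order(subject.rules):
            # Assumption: all() over zero classifiers is True, so such a rule matches.
            if all(classifier(packet) for classifier in rule.classifiers):
                return [action.name for action in by_order(rule.actions)]
    # No rule matched, so no subject allows this traffic.
    return None
```
+
+Returning `None` here stands for the case where no subject allows the communication.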
+
+We've now reached the final phase of the three-phase policy resolution process.  First, we found the set of contract scopes to apply.  Second, we found the set of subjects to apply.  Finally, we saw how we apply the subjects to traffic between pairs of endpoints in order to realize the policy.  The remaining sections will fill in additional detail for the policy resolution process.
+
+[[matchers]]
+==== Matchers
+
+Matchers have been mentioned a few times now without really explaining what they are.  Matchers specify a set of labels (which include requirements, capabilities, conditions, and qualities) to match against.  There are several kinds of matchers that operate similarly:
+
+* Quality matchers are used in target selectors during the contract selection phase.  Quality matchers provide a more advanced and flexible way to select contracts compared to a named selector.
+* Requirement matchers and capability matchers are used in clauses during the subject selection phase to match against requirements and capabilities on endpoint groups.
+* Condition matchers are used in clauses during the subject selection phase to match against conditions on endpoints.
+
+A matcher is, at its heart, fairly simple.  It will contain a list of label names, along with a _match type_.  The match type can be either "all," which means the matcher matches when all of its labels match, "any," which means the matcher matches when any of its labels match, or "none," which means the matcher matches when none of its labels match.  Note that a matcher which always matches can be made by matching against an empty set of labels with a match type of "all."
+
+Additionally each label to match can optionally include a relevant "name" field.  For quality matchers, this is a target name.  For capability and requirement matchers, this is a selector name.  If the name field is specified, then the matcher will only match against targets or selectors with that name, rather than any targets or selectors.
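+
+The three match types can be sketched in Python as shown below. The `Matcher` class is a hypothetical illustration that works on plain label names only; the optional name field and the inheritance semantics are left out.
+
```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Matcher:
    """Hypothetical matcher: a set of label names plus a match type."""
    labels: FrozenSet[str]
    match_type: str = "all"  # one of "all", "any", "none"

    def matches(self, exposed: FrozenSet[str]) -> bool:
        # Labels named by the matcher that are actually present on the target.
        hits = self.labels & exposed
        if self.match_type == "all":
            return hits == self.labels
        if self.match_type == "any":
            return bool(hits)
        if self.match_type == "none":
            return not hits
        raise ValueError(f"unknown match type: {self.match_type!r}")
```
+
+With an empty label set and a match type of "all", `hits` and `labels` are both empty, so the matcher always matches, as noted above.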
+
+There are some additional semantics related to inheritance.  Please see the section for <<inheritance,Inheritance>> for more details.
+
+===== Quality Matchers
+
+A contract contains _targets_ that are just a set of quality labels.  A target selector on an endpoint group matches against these targets using quality matchers.  A quality matcher is a matcher where the label it matches is a quality, and the name field is a target name.
+
+===== Requirement and Capability Matchers
+
+The matching selector from the contract selection phase will define either requirements or capabilities for the consumer and provider endpoint groups, respectively.  Clauses can match against these labels using requirement and capability matchers.  Requirement matchers match against requirements while capability matchers match against capabilities.  In both cases, the name field is a selector name.
+
+===== Condition Matcher
+
+Endpoints can have condition labels.  The condition matcher can be used in a clause to match against endpoints with particular combinations of conditions.
+
+==== Tenants
+The system allows multiple tenants that are designed to allow separate domains of administration.  Contracts and endpoint groups are defined within the context of a particular tenant.  Endpoints that belong to endpoint groups in separate tenants cannot communicate with each other except through a special mechanism to allow cross-tenant contracts called _contract references_.
+
+While it would be possible to define semantics for tenant inheritance, as currently defined there is no way for tenants to inherit from each other.  There is, however, a limited mechanism through the special _common tenant_ (see <<common_tenant,Common Tenant>> below).  All references to names are within the scope of that particular tenant, with the limited exceptions of the common tenant and contract references.
+
+===== Contract References
+Contract references are the mechanism by which endpoints in different tenants can communicate.  This is especially useful for such common use cases as gateway routers or other shared services.  In order for an endpoint group to select a contract in a different tenant, there must first exist a contract reference defined in the endpoint group's local tenant.  The contract reference is just a tenant ID and a contract ID; this will bring that remote contract into the scope of the local tenant.  Note that this reference may be subject to additional access control mechanisms.
+
+Endpoint groups can participate in such remotely-defined contracts only as consumers, not as providers.
+
+Once the contract reference exists, endpoint groups can now select that contract using either named or target selectors.  By defining a contract reference, the qualities and targets in that contract are imported into the namespace of the local tenant for the contract selection phase.  Similarly, the requirements and conditions from the local tenant will be used when performing the consumer matches in the subject selection phase.
+
+[[common_tenant]]
+===== Common Tenant
+
+The common tenant is an area where definitions that are useful for all tenants can be created.  Everything defined in the common tenant behaves exactly as though it were defined individually in every tenant.  This applies to resolution of labels for the purposes of contract selection, as well as subject feature instances (see <<subject_features,Subject Features>> below).
+
+If a name exists in both the common tenant and another tenant, then when resolving names within the context of that tenant the definition in the common tenant will be masked.  One special case to consider is if a definition in a tenant defines the common tenant definition as its parent and uses the same name as the parent object.  This works as you might expect: the name reference from the child definition will extend the behavior of the definition in the common tenant, but then mask the common tenant definition so that references to the name within the tenant will refer to the extended object.
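+
+The masking behavior can be pictured with a layered lookup. The Python sketch below uses the standard library `ChainMap`, which searches its mappings in order; the function `tenant_scope` and the sample definitions are hypothetical, and definitions are reduced to opaque values.
+
```python
from collections import ChainMap
from typing import Mapping


def tenant_scope(tenant_defs: Mapping[str, object],
                 common_defs: Mapping[str, object]) -> ChainMap:
    """Build a name scope in which tenant definitions mask common tenant definitions."""
    # ChainMap looks in the first mapping before the second one.
    return ChainMap(dict(tenant_defs), dict(common_defs))


common = {"http": "common-http", "dns": "common-dns"}
tenant = {"http": "tenant-http"}
scope = tenant_scope(tenant, common)
print(scope["http"])  # the tenant definition masks the common one
print(scope["dns"])   # falls through to the common tenant
```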
+
+[[subject_features]]
+==== Subject Features
+
+Subject features are objects that can be used as a part of a subject to affect the communication between pairs of endpoints.  This is where the policy model meets the underlying infrastructure.  Because different networks will have different sets of features, we need some way to represent to the users of the policy what is possible.  Subject features are the answer to this.
+
+There are two kinds of subject features: classifiers and actions.  Classifiers match on traffic between endpoints, and actions perform some operation on that traffic (See <<rule_application,Rule Application>> above for more information on how they are used).
+
+Subject features are defined with a subject feature definition.  The definition defines a name and description for the feature, along with a set of parameters that the item can take.  This is the most general description for the subject feature, and this definition is global and applies across all tenants.  As an example, a classifier definition might be called "tcp_port", and would take an integer parameter "port".
+
+Next, there are subject feature instances.  Subject feature instances are scoped to a particular tenant, and reference a subject feature definition, but fill in all required parameters.  To continue our example, we might define a classifier instance called "http" that references the "tcp_port" classifier and specifies the parameter "port" as 80.
+
+Finally, there are subject feature references, which are references to subject feature instances.  Subjects contain these subject feature references in order to apply the feature.  These references also contain, as appropriate, an order field to determine order of operations and fields for matching the direction of the traffic.
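+
+To make the three layers concrete, here is a small Python sketch of the "tcp_port" and "http" example. The classes `ClassifierDefinition`, `ClassifierInstance`, and `ClassifierReference` and their fields are hypothetical stand-ins and do not reflect the project's yang models; the tenant name and description text are made up for the example.
+
```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClassifierDefinition:
    """Global definition: a name, a description, and the parameters it accepts."""
    name: str
    description: str
    parameters: Dict[str, type]


@dataclass
class ClassifierInstance:
    """Tenant-scoped instance that fills in the parameters of a definition."""
    name: str
    tenant: str
    definition: ClassifierDefinition
    values: Dict[str, object] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """List missing or wrongly typed parameters (empty when the instance is complete)."""
        found = []
        for param, expected in self.definition.parameters.items():
            if param not in self.values:
                found.append(f"missing parameter {param!r}")
            elif not isinstance(self.values[param], expected):
                found.append(f"parameter {param!r} is not {expected.__name__}")
        return found


@dataclass
class ClassifierReference:
    """Reference held by a subject: which instance to apply, and in what order."""
    instance_name: str
    order: int = 0


tcp_port = ClassifierDefinition("tcp_port", "Match a TCP port", {"port": int})
http = ClassifierInstance("http", "tenant-a", tcp_port, {"port": 80})
http_ref = ClassifierReference("http", order=1)
```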
+
+If the underlying network infrastructure is unable to implement a particular subject, then it must raise an exception related to that subject.  It may then attempt to relax the constraints in a way that allows it to implement the policy.  However, when doing this it must attempt to avoid allowing traffic that should not be allowed.  That is, it should "fail closed" when relaxing constraints.
+
+==== Forwarding Model
+
+Communication between endpoint groups can happen at layer 2 or layer 3, depending on the policy configuration.  We define our model of the forwarding behavior in a way that supports very flexible semantics including overlapping layer 2 and layer 3 addresses.
+
+We define several kinds of _network domains_, which represent some logical grouping or namespace of network addresses:
+
+L3 Context::
+A layer 3 context represents a namespace for layer 3 addresses.  It represents a domain inside which endpoints can communicate without requiring any address translation.  A subtype of a forwarding context, which is a subtype of a network domain.
+L2 Bridge Domain::
+A layer 2 bridge domain represents a domain in which layer 2 communication is possible when allowed by policy.  Bridge domains each have a single parent L3 context. A subtype of an L2 domain, which is a subtype of a forwarding context.
+L2 Flood Domain::
+A layer 2 flood domain represents a domain in which layer 2 broadcast and multicast is allowed.  L2 flood domains each have a single parent L2 bridge domain.  A subtype of an L2 domain.
+Subnet::
+An IP subnet associated with a layer 2 or layer 3 context.  Each subnet has a single parent forwarding context.  A subtype of a network domain.
+
+Every endpoint group references a single network domain.
+
+[[inheritance]]
+==== Inheritance
+
+This section contains information on how inheritance works for various objects in the system.  This is covered here to avoid cluttering the main sections with a lot of details that would make it harder to see the big picture.
+
+Some objects in the system include references to parents, from which they will inherit definitions.  The graph of parent references must be loop free.  When resolving names, the resolution system must detect loops and raise an exception.  Objects that are part of these loops may be considered as though they are not defined at all.
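+
+One simple way to detect such loops is to walk the parent references while remembering the names already visited. The Python sketch below assumes each object has at most one parent and represents the graph as a dictionary from object name to parent name; `InheritanceLoopError` and `parent_chain` are hypothetical names.
+
```python
from typing import Dict, List, Optional


class InheritanceLoopError(Exception):
    """Raised when following parent references revisits an object."""


def parent_chain(name: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the object followed by its ancestors, nearest first."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = name
    while current is not None:
        if current in seen:
            raise InheritanceLoopError(f"parent loop detected at {current!r}")
        seen.add(current)
        chain.append(current)
        # Objects without an entry are treated as having no parent.
        current = parents.get(current)
    return chain
```
+
+A caller that catches the exception can then treat the objects in the loop as undefined, as described above.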
+
+Generally, inheritance works by simply importing the objects in the parent into the child object.  When there are objects with the same name in the child object, then the child object will override the parent object according to rules which are specific to the type of object.  We'll next explore the detailed rules for inheritance for each type of object.
+
+===== Endpoint Groups
+
+Endpoint groups will inherit all their selectors from their parent endpoint groups.  Selectors with the same names as selectors in the parent endpoint groups will inherit their behavior as defined below.
+
+====== Selectors
+
+Selectors include provider named selectors, provider target selectors, consumer named selectors, and consumer target selectors.  Selectors cannot themselves have parent selectors, but when selectors have the same name as a selector of the same type in the parent endpoint group, then they will inherit from and override the behavior of the selector in the parent endpoint group.
+
+[red]*Named Selectors*
+
+Named selectors will add to the set of contract IDs that are selected by the parent named selector.
+
+[red]*Target Selectors*
+
+A target selector in the child endpoint group with the same name as a target selector in the parent endpoint group will inherit quality matchers from the parent.  If a quality matcher in the child has the same name as a quality matcher in the parent, then it will inherit as described below under Matchers.
+
+===== Contracts
+
+Contracts will inherit all their targets, clauses and subjects from their parent contracts.  When any of these objects have the same name as in the parent contract, then the behavior will be as defined below.
+
+====== Targets
+
+A target cannot itself have a parent target, but it may inherit from a target with the same name in a parent contract.  Qualities in the target will be inherited from the parent.  If a quality with the same name is defined in the child, then this does not have any semantic effect unless the quality has its inclusion-rule parameter set to "exclude."  In that case, the label should be ignored for the purpose of matching against this target.
+
+====== Subjects
+
+A subject cannot itself have a parent subject, but it may inherit from a subject with the same name in a parent contract.
+
+The order parameter in the child subject, if present, will override the order parameter in the parent subject.
+
+The rules in the parent subject will be added to the rules in the child subject.  However, the rules will _not_ override rules of the same name.  Instead, all rules in the parent subject will be considered to run with a higher order than all rules in the child; that is all rules in the child will run before any rules in the parent.  This has the effect of overriding any rules in the parent without the potentially-problematic semantics of merging the ordering.
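+
+The resulting evaluation order can be sketched as a concatenation of two sorted lists. In this Python illustration, `Rule` and `effective_rules` are hypothetical names and a rule is reduced to its name and order.
+
```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    name: str
    order: int


def effective_rules(child_rules: List[Rule], parent_rules: List[Rule]) -> List[Rule]:
    """Order rules so that every child rule runs before any parent rule."""
    def by_order(rules: List[Rule]) -> List[Rule]:
        return sorted(rules, key=lambda rule: (rule.order, rule.name))

    # Parent rules are kept, even when a child rule shares their name.
    return by_order(child_rules) + by_order(parent_rules)
```
+
+Since only the first matching rule applies, a child rule that matches the same traffic as a parent rule takes effect in its place.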
+
+====== Clauses
+
+A clause cannot itself have a parent clause, but it may inherit from a clause with the same name in a parent contract.
+
+The list of subject references in the parent clause will be added to the list of subject references in the child clause.  There is no meaningful overriding possible here; it's just a union operation.  Note of course though that a subject reference that refers to a subject name in the parent contract might have that name overridden in the child contract.
+
+Each of the matchers in the clause is also inherited by the child clause.  Matchers in the child of the same name and type as a matcher from the parent will inherit from and override the parent matcher.  See below under <<inheritance_matchers,Matchers>> for more information.
+
+[[inheritance_matchers]]
+===== Matchers
+
+Matchers include quality matchers, condition matchers, requirement matchers, and capability matchers.  Matchers cannot themselves have parent matchers, but when there is a matcher of the same name and type in the parent object, then the matcher in the child object will inherit and override the behavior of the matcher in the parent object.
+
+The match type, if specified in the child, overrides the value specified in the parent.
+
+Labels are also inherited from the parent object.  If there is a label with the same name in the child object, this does not have any semantic effect unless the label has its inclusion-rule parameter set to "exclude."  In that case, the label should be ignored for the purpose of matching.  Otherwise, the label with the same name will completely override the label from the parent.
+
+===== Subject Feature Definitions
+
+Subject feature definitions, including classifier definitions and action definitions, can also inherit from each other by specifying a parent object.  These are a bit different from the other forms of override because they do not merely affect the policy resolution process, but rather affect how the policy is applied in the underlying infrastructure.
+
+For the purposes of policy resolution, a subject feature will inherit from its parent any named parameters.  However, unlike in other cases, if a named parameter with the same name exists in the child as in the parent, this is an invalid parameter and will be ignored in the child.  That is, the child _cannot_ override the type of a named parameter inherited from its parent subject feature.
+
+For the purposes of applying the subject in the underlying infrastructure, the child subject feature is assumed to add some additional functionality to the parent subject feature such that the child feature is a specialization of that parent feature.  For example, there might be a classifier definition for matching against a TCP port, and a child classifier definition that allowed for deep packet inspection for a particular protocol that extended the base classifier definition.  In this case, the child classifier would be expected to match the TCP port as well as apply its additional deep packet inspection semantics.
+
+If the underlying infrastructure is unable to apply a particular subject feature, it can attempt to fall back to implementing instead the parent subject feature.  The parameter fallback-behavior determines how this should apply.  If this is set to "strict" then a failure to apply the child is a fatal error and the entire subject must be ignored.  If the fallback behavior is "allow-fallback" then the error is nonfatal and it is allowed to apply instead only the parent subject feature.
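+
+A renderer's fallback decision might look like the following Python sketch. The names `FeatureDefinition` and `choose_feature` are hypothetical, the default of "strict" is an assumption chosen so that the sketch fails closed, and repeating the fallback across several levels of parents is also an assumption that the text above does not spell out.
+
```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class FeatureDefinition:
    name: str
    parent: Optional[str] = None
    # Assumption: "strict" is used as the default so that failures are fatal.
    fallback_behavior: str = "strict"


def choose_feature(name: str, definitions: Dict[str, FeatureDefinition],
                   supported: Set[str]) -> Optional[str]:
    """Return the feature to apply, or None when the subject must be ignored."""
    current: Optional[str] = name
    while current is not None:
        if current in supported:
            return current
        definition = definitions[current]
        if definition.fallback_behavior != "allow-fallback":
            # Strict: failing to apply this feature is a fatal error.
            return None
        # Nonfatal: try the parent subject feature instead.
        current = definition.parent
    return None
```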
+
+=== State Repositories
+
+The state repositories are distributed data stores that provide the configuration and operational data required for renderers to apply the policy as specified by the user.  The state repositories all model their state as yang models, and store that state in the MD-SAL data store as either operational or configuration data, as appropriate.  The state repositories implement a minimum amount of actual functionality and instead focus on defining the models and supporting the correct querying and subscription semantics.  The intelligence is expected to be in the renderers.
+
+==== Querying and Subscription
+
+State repositories support simple queries on the data but, more importantly, allow subscriptions to the data, so that systems that are responsible for applying the policy model are informed about changes to the policy configuration or operational state that might affect the policy.  Those subsystems are expected to continuously reevaluate the policy as these changes come in and make the required changes in the underlying infrastructure.
+
+[[endpoint_repository]]
+==== Endpoint Repository
+
+The endpoint repository is responsible for storing metadata about endpoints, including how they are mapped into endpoint groups.  Information about endpoints can be added to the repository either by a central orchestration system or by a renderer that performs discovery to learn about new endpoints.  In either case, the semantics of how an endpoint is mapped to an endpoint group are not defined here; the system that sets up the information in the endpoint repository must have its own method for assigning endpoints to endpoint groups.
+
+==== Policy Repository
+
+The policy repository stores the policies themselves. This includes endpoint groups, selectors, contracts, clauses, subjects, rules, classifiers, actions, and network domains (everything in the policy model except endpoints and endpoint-related metadata). The policy repository is populated through the northbound APIs.
+
+==== Status Repository
+
+The status repository will be added in a future release of group-based policy.
+
+=== Renderers
+
+One of the key design features of the group-based policy architecture is that it can support a variety of renderers based on very different underlying technology.  This is possible because the policy model is based only on high-level user intent, and contains no information about the details of how the network traffic is actually forwarded.  However, one consequence of this design choice is that the renderers actually contain most of the complexity in the design of the system, and most of the real problems in building a software-defined virtual network solution will need to be solved by the renderers themselves.
+
+==== Renderer Common Framework
+
+The renderers have available to them some services and libraries that collectively make up the _renderer common framework_.  These are not required to implement a renderer, but functionality that would be generally useful should be placed here where convenient.
+
+===== `InheritanceUtils`
+
+This class provides a convenient utility to resolve all the complex inheritance rules into a normalized view of the policy for a tenant.
+
+[source,java]
+----
+  /**
+   * Fully resolve the specified {@link Tenant}, returning a tenant with all
+   * items fully normalized.  This means that no items will have parent/child
+   * relationships and can be interpreted simply without regard to inheritance
+   * rules
+   * @param tenantId the {@link TenantId} of the {@link Tenant}
+   * @param transaction a {@link DataModificationTransaction} to use for
+   * reading the data from the policy store
+   * @return the fully-resolved {@link Tenant}
+   */
+  public static Tenant resolveTenant(TenantId tenantId,
+                                     DataModificationTransaction transaction)
+----
+
+===== `PolicyResolverService`
+
+The policy resolver service is a utility for renderers that resolves the group-based policy model, including contract resolution, into a representation that is easier to apply to the actual network.
+
+For any pair of endpoint groups, there is a set of rules that could apply to the endpoints in those groups based on the policy configuration.  The exact list of rules that apply to a given pair of endpoints depends on the conditions that are active on the endpoints.
+
+In a more formal sense: Let there be endpoint groups _G~n~_, and for each _G~n~_ a set of conditions _C~n~_ that can apply to endpoints in _G~n~_.  Further, let _S_ be the set of lists of rules defined in the policy.  Our policy can be represented as a function _F_: (_G~n~_, 2 _^C~n~^_, _G~m~_, 2 _^C~m~^_) \-> _S_, where 2 _^C~n~^_ represents the power set of _C~n~_. In other words, we want to map all the possible pairs of endpoint groups, along with the active conditions of their endpoints, onto the right list of rules to apply.
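+
+As a rough illustration of this mapping, the function _F_ can be modeled as a lookup table keyed by group and active-condition set.  The class and method names below are hypothetical and are not part of the policy resolver API; the sketch only shows the shape of the domain and range.
+
```python
from itertools import combinations


def power_set(conditions):
    """Return every subset of a condition set, i.e. 2^C."""
    items = sorted(conditions)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


class PolicyFunction:
    """Hypothetical table-backed model of F: (Gn, 2^Cn, Gm, 2^Cm) -> S."""

    def __init__(self):
        self._rules = {}

    def set_rules(self, src_group, src_conds, dst_group, dst_conds, rules):
        key = (src_group, frozenset(src_conds), dst_group, frozenset(dst_conds))
        self._rules[key] = list(rules)

    def lookup(self, src_group, src_conds, dst_group, dst_conds):
        key = (src_group, frozenset(src_conds), dst_group, frozenset(dst_conds))
        # No entry means no rules apply to this combination.
        return self._rules.get(key, [])
```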
+
+We need to be able to query against this policy model, enumerate the relevant classes of traffic and endpoints, and notify renderers when there are changes to policy as it applies to active sets of endpoints and endpoint groups.
+
+The policy resolver will maintain the necessary state for all tenants in its control domain, which is the set of tenants for which policy listeners have been registered.
+
+[[ovs_overlay]]
+==== Open vSwitch-Based Overlay Renderers
+
+This section describes a data plane architecture for building a network virtualization solution using Open vSwitch.  This data plane design is used by two renderers: the <<openflow_renderer,OpenFlow Renderer>> and the <<opflex_renderer,OpFlex Renderer>>.
+
+The data plane implements an overlay design and is intended to support the following use cases:
+
+* Routing when required between endpoint groups, including serving as a distributed default gateway.
+* Optional broadcast within a bridge domain.
+* Management of L2 broadcast protocols including ARP and DHCP to avoid broadcasting.
+* Layer 2-4 classifiers for policy between endpoint groups, including connection tracking/reflexive ACLs.
+* Service insertion/redirection.
+
+===== Network Architecture
+
+====== Network Topology
+
+The network architecture is an overlay network based on VXLAN or similar encapsulation technology, with an underlying IP network that provides connectivity between hypervisors and the controller.  The overlay network is a full-mesh set of tunnels that connect each pair of vSwitches.
+
+The "underlay" IP network has no special requirements.  For the best performance it should be set up with ECMP to the top-of-rack switch, but this is not a strict requirement for correct behavior.  Also, the underlay network should be configured with a path MTU that's large enough to accommodate the overlay tunnel headers.  For a typical overlay network with a 1500 byte MTU, a 1600 byte MTU in the underlay network should be sufficient.  If this is not configured correctly, the behavior will still be correct, but the resulting fragmentation could have a severe negative effect on performance.
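+
+The MTU recommendation can be checked with a short calculation.  The sketch below assumes VXLAN encapsulation over an IPv4 underlay with no VLAN tags and no IP options; other encapsulations or an IPv6 underlay would change the header sizes.
+
```python
# Header sizes in bytes (assumption: IPv4 underlay, no VLAN tags, no IP options).
INNER_ETHERNET = 14  # the guest's Ethernet header travels inside the tunnel
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20


def required_underlay_mtu(overlay_mtu: int) -> int:
    """Smallest underlay IP MTU that carries a full overlay packet unfragmented."""
    return overlay_mtu + INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


print(required_underlay_mtu(1500))
```
+
+Under these assumptions a 1500 byte overlay MTU needs 1550 bytes in the underlay, so a 1600 byte underlay MTU leaves some headroom.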
+
+Physical devices such as routers on the IP network are trusted entities in the system since these devices would have the ability to forge encapsulated packets.
+
+[[network_topology_example]]
+.GBP OVS Network Topology Example
+
+image::gbp_overlay_design_red_tunnel.png[width="80%"]
+
+The <<network_topology_example,Network Topology Example>> figure shows an example of a supported network topology, with an underlying IP network and hypervisors with Open vSwitch.  Infrastructure components and elements of the underlay network are shown in grey.  Three endpoint groups exist with different subnets in the same layer 3 context, which are shown in red, green, and blue.  A tunneled path (dotted red line) is shown between two red virtual machines on different VM hosts.
+
+====== Control Network
+
+The security of the system depends on keeping a logically isolated control network separate from the data network, so that guests cannot reach the control network.  Ideally, the network is kept isolated through an out-of-band control network.  This can be accomplished using a separate NIC, a special VLAN, or other mechanism.  However, the system is also designed to operate in the case where the control traffic and the data traffic are on the same layer 2 network and isolation is still enforced.
+
+In the <<network_topology_example,Network Topology Example>> figure above, the control network is shown as 172.16/16.  The VM hosts and controllers all have addresses on this network, and communicate using OpenFlow and OVSDB on this network.  In the example, the router is shown with an interface configured on this network as well; this works, but in practice it is preferable to isolate this network by accessing it through a VPN or jump box if needed.  Note that there is no requirement that the control network be all in one subnet.
+
+The router is also shown with an interface configured on the 10/8 network.  This network will be used for routing traffic destined for internet hosts.  Both the 172.16/16 and 10/8 networks here are isolated from the guest address spaces.
+
+====== Overlay Network
+
+Whenever traffic between two guests is in the network, it will be encapsulated using a VXLAN tunnel (though support for additional encapsulation formats could be added in the future).  A packet encapsulated as VXLAN contains:
+
+* Outer ethernet header, with source and destination MAC
+* Outer IP header, with source and destination IP address
+* Outer UDP header
+* VXLAN header, with a virtual network identifier (VNI).  The virtual network identifier is a 24-bit field that uniquely identifies an endpoint group in our policy model.
+* Encapsulated original packet, which includes:
+** Inner ethernet header, with source and destination MAC
+** (Optional) Inner IP header, with source and destination IP address
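+
+To make the VXLAN header in the list above concrete, the following sketch packs and unpacks the 8-byte header with its 24-bit VNI.  The helper names are hypothetical; the layout (a flags byte with the VNI-valid bit, reserved bits, then the VNI) follows the VXLAN specification.
+
```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit virtual network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits.
    # Word 2: VNI in the top 24 bits, 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    _flags_word, vni_word = struct.unpack("!II", header[:8])
    return vni_word >> 8
```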
+
+====== Delivering Packets
+
+Endpoints can communicate with each other in a number of different ways, and each is processed slightly differently.  Endpoint groups exist inside a particular layer 2 or layer 3 context which represents a namespace for their network identifiers.  It's only possible for endpoints to address endpoints within the same context, so no communication is possible for endpoints in different layer 3 contexts, and only layer 3 communication is possible for endpoints in different layer 2 contexts.
+
+[red]*Overlay Tunnels*
+
+The next key piece of information is the location of the destination endpoint.  For destinations on the same switch, we can simply apply policy (see below), perform any routing action required (see below), then deliver it to the local port.
+
+When the endpoints are located on different switches, we need to use the overlay tunnel.  This is the case shown as a dotted red line in the <<network_topology_example,Network Topology Example>> figure.  After policy is applied to the packet, we encapsulate it in a tunnel with the tunnel ID set to a unique ID for the source endpoint group.  The outer packet is addressed to the IP address of the OVS host that hosts the destination endpoint.  This encapsulated packet is now sent out to the underlay network, which is just a regular IP network that can deliver the packet to the destination switch.
+
+When the encapsulated packet arrives on the other side, the destination vSwitch inspects the metadata of the encapsulation header to see if the policy has been applied already. If the policy has not been applied or if the encapsulation protocol does not support carrying of metadata, the policy must be applied at the destination vSwitch. The packet can now be delivered to the destination endpoint.
+
+[red]*Bridging and Routing*
+
+The system will transparently handle bridging or routing as required.  Bridging occurs between endpoints in the same layer 2 context, while routing will generally be needed for endpoints in different layer 2 contexts.  More specifically, a packet needs to be routed if it is addressed to the gateway MAC address.  We can simply use a fixed MAC address to serve as the gateway everywhere.  Packets addressed to any other MAC address can be bridged.
+
+Bridged packets are easy to handle, since we don't need to do anything special to them to deliver them to the destination.  They can be simply delivered unmodified.
+
+Routing is slightly more complex, though not massively so.  When routing locally on a switch, we simply rewrite the destination MAC address to the MAC of the destination endpoint, and set the source MAC to the gateway MAC, decrement the TTL, and then deliver it to the correct local port.
+
+When routing to an endpoint on a different switch, we'll actually perform routing in two steps.  On the source switch, we will decrement the TTL and rewrite the source MAC address to the MAC of the gateway router (so that both the source and the destination MAC are set to the gateway router's MAC).  The packet is then delivered to the destination switch using the appropriate tunnel.  On the destination switch, we perform a second routing action by rewriting the destination MAC to the MAC address of the destination endpoint and decrementing the TTL again.  The reason we do the routing in two hops is that this avoids the need to maintain on every switch the correct MAC address for every endpoint on the network.  Each switch needs the mappings only for endpoints that are directly attached to that switch.  An example of a communication pathway requiring this routing is shown in the figure below.
+
+.GBP OVS Routing Example
+
+image::gbp_overlay_design_blue_to_red_tunnel.png[width="80%"]
+
+In this example, we show the path of traffic from the blue guest 192.168.2.3 to the red guest 192.168.1.2.  The traffic is encapsulated in a tunnel marked with the blue endpoint group's VNI while in transit between the two switches.  Because the two endpoints are in different subnets, the traffic is routed in two hops: one on the source switch and one on the destination switch.
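+
+The two routing hops can be sketched as a pair of packet rewrites.  The `Packet` class, the function names, and the gateway MAC value are illustrative assumptions; only the rewrite steps themselves come from the description above.
+
```python
from dataclasses import dataclass, replace

GATEWAY_MAC = "00:00:5e:00:01:01"  # hypothetical fixed gateway MAC


@dataclass(frozen=True)
class Packet:
    src_mac: str
    dst_mac: str
    ttl: int


def route_on_source_switch(pkt: Packet) -> Packet:
    """First hop: source MAC becomes the gateway MAC, TTL is decremented."""
    return replace(pkt, src_mac=GATEWAY_MAC, ttl=pkt.ttl - 1)


def route_on_destination_switch(pkt: Packet, endpoint_mac: str) -> Packet:
    """Second hop: destination MAC becomes the endpoint MAC, TTL is decremented."""
    return replace(pkt, dst_mac=endpoint_mac, ttl=pkt.ttl - 1)
```
+
+Only the destination switch needs to know `endpoint_mac`, which is the point of splitting the routing into two hops.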
+
+The vSwitch on each host must respond to local ARP requests for the gateway IP address and return a logical MAC address representing the L3 gateway.
+
+[red]*Communicating with Outside Hosts*
+
+Everything up until now is quite simple, but it's possible to conceive of situations where endpoints in our network need to communicate over the internet or with other endpoints outside the overlay network.  There are two broad approaches for handling this.  In both cases, we allow such access only via layer 3 communication.
+
+First, we can map physical interfaces on an OVS system into the overlay network.  If a router interface is attached either directly to a physical interface or indirectly via an isolated network, then the router interface can be easily exposed as an endpoint in the network.  Endpoints can then communicate with this router interface (perhaps after some intermediate routing via the distributed routing scheme described above) and from there get to the rest of the world.  Dedicated OVS systems can be thus configured as gateway devices into the overlay network which will then be needed for any of this north/south communication.  This has the advantage of being very conceptually simple but requires special effort to load balance the traffic effectively.
+
+Second, we can use a DNAT scheme to allow access to endpoints that are reachable via the underlay network.  In this scheme, for every endpoint that is allowed to communicate to these outside hosts, we allocate an IP address from a dedicated set of subnets on the underlay (each network segment in the underlay network will require a separate DNAT range for switches attached to that subnet).  We can perform the DNAT translation on the OVS switch and then simply deliver the traffic to the underlay network to deliver to the internet host or other host, and perform the reverse translation to get back into the overlay network.
+
+.GBP OVS Example of Communication With Outside Endpoints
+
+image::gbp_overlay_design_red_to_outside.png[width="80%"]
+
+An example of communication with outside endpoints using the DNAT scheme is shown in the figure above.  In this example, the red endpoint is communicating with an endpoint on the internet through a gateway router.  The traffic goes through a DNAT translation to an IP allocated to the endpoint for this purpose.  The IP communication can then be delivered through the IP underlay network.
+
+For the first implementation, we'll stick with the DNAT scheme and consider implementing the gateway-based or another solution later.
+
+===== Packet Processing Pipeline
+
+.GBP OVS Packet Processing Pipeline
+
+image::gbp_ovs_pipeline.png[width="65%"]
+
+Here is a simplified high-level view of what happens to a packet in this network when it hits an OVS instance:
+
+. If data and management network are shared, determine whether packet is targeted for the host system. If so, reinject into host networking stack.
+. Apply port security rules if enabled on the port to determine if the source identifiers (MAC and IP) are allowed on the port.
+ * For packets received from the overlay: Determine the source endpoint group (sEPG) based on the tunnel ID from the outer packet header.
+ * For packets received from local ports: Determine sEPG based on source port and source identifiers as configured.
+ * As an sEPG can only be associated with a single L2 and L3 context, the context is determined in this step as well.
+ * Unknown source identifiers may result in a packet-in if the network is doing learning.
+. Handle broadcast and multicast packets while respecting broadcast domains.
+. Catch any packet types that require special handling.  This could include ARP, DHCP, or LLDP.  How these are handled may depend on the specific renderer implementation.
+. Determine whether the packet will be bridged or routed. If the destination MAC address is the default gateway MAC, then the packet will be routed, otherwise it will be bridged.
+. Determine the destination endpoint group (dEPG) and outgoing port or next hop while respecting the L2/L3 context.
+ * For bridged packets (L2): Determine based on the destination MAC address.
+ * For routed packets (L3): Determine based on the destination IP address.
+. Apply the appropriate set of policy rules based on the active subjects for that flow.  We can bypass this step if the tunnel metadata indicates that the policy has been applied at the source.
+. Apply a routing action if needed by modifying the destination and source MAC and decrementing the TTL.
+ * For local destination: Rewrite the destination MAC to the MAC address for the connected endpoint, source MAC to the MAC of the default gateway.
+ * For remote destinations: Rewrite the destination MAC to the MAC of the next hop, source MAC to the MAC of the default gateway.
+. If the next hop is a local port, then it is delivered as-is.  If the next hop is not local, then the packet is encapsulated and the tunnel ID is set to the network identifier for the source endpoint group (sEPG).  If the packet is a layer 2 broadcast packet, then it will need to be written to the correct set of ports, which might be a combination of local and multiple remote tunnel endpoints.
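+
+The bridge/route and local/remote decisions in the steps above can be condensed into a small sketch.  The function and its return format are hypothetical and ignore broadcast handling and policy enforcement; they only illustrate which inputs drive each decision.
+
```python
def forwarding_decision(dst_mac, gateway_mac, next_hop_is_local, sepg_vni):
    """Decide how a unicast packet leaves the switch.

    Packets addressed to the gateway MAC are routed, all others are
    bridged.  Remote next hops are encapsulated with the tunnel ID set
    to the network identifier of the source endpoint group.
    """
    mode = "route" if dst_mac == gateway_mac else "bridge"
    if next_hop_is_local:
        return {"mode": mode, "encapsulate": False, "tunnel_id": None}
    return {"mode": mode, "encapsulate": True, "tunnel_id": sepg_vni}
```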
+
+====== Register Usage
+
+The processing pipeline needs to store metadata such as the sEPG, dEPG, and broadcast domain. This metadata can be stored in any way supported by the switch. OpenFlow provides a dedicated 64-bit metadata field, and Open vSwitch additionally provides multiple 32-bit registers in the form of Nicira extensions. The following examples use Nicira extensions for simplicity. The choice of register usage is an implementation detail of the renderer.
+
+*Option 1: Register allocation using Nicira Extensions*
+
+[cols="1m,4",options="header"]
+|====
+|Register|Value
+|NXM_NX_REG1 |Source Endpoint Group (sEPG) ID
+|NXM_NX_REG2 |L2 context (BD)
+|NXM_NX_REG3 |Destination Endpoint Group (dEPG) ID
+|NXM_NX_REG4 |Port number to send packet to after policy enforcement. This is required because port selection occurs before policy enforcement in the pipeline.
+|NXM_NX_REG5 |L3 context ID (VRF)
+|====
+
+*Option 2: Register allocation using OpenFlow metadata*
+
+Alternatively, the single 64-bit OpenFlow metadata field can be used to store the sEPG, dEPG, and BD throughout the lookup process. The advantage over using Nicira extensions is better portability and offload capability to hardware.
+
+[cols="1,4",options="header"]
+|====
+|Register|Value
+|metadata[0..15] |Source Endpoint Group (sEPG) ID
+|metadata[16..31] |Destination Endpoint Group (dEPG) ID
+|metadata[32..39] |L2 context (BD)
+|metadata[40..47] |L3 context (VRF)
+|metadata[48..63] |Port number to send packet to after policy enforcement. This is required because port selection occurs before policy enforcement in the pipeline.
+|====
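+
+The bit layout in the table above can be expressed as a pair of pack/unpack helpers.  This is a sketch under the assumption that bit 0 is the least significant bit of the metadata field; the function and field names are hypothetical.
+
```python
# name: (bit offset, width in bits), taken from the Option 2 table
FIELDS = {
    "sepg": (0, 16),
    "depg": (16, 16),
    "bd": (32, 8),
    "vrf": (40, 8),
    "port": (48, 16),
}


def pack_metadata(**values):
    """Combine the per-packet values into one 64-bit metadata integer."""
    metadata = 0
    for name, (offset, width) in FIELDS.items():
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        metadata |= value << offset
    return metadata


def unpack_metadata(metadata):
    """Split a 64-bit metadata integer back into its named fields."""
    return {name: (metadata >> offset) & ((1 << width) - 1)
            for name, (offset, width) in FIELDS.items()}
```
+
+Note that this layout limits the L2 and L3 context IDs to 8 bits each, which is one trade-off against the 32-bit Nicira registers.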
+
+====== Table/Pipeline Names and Order
+
+To increase readability, the following table names are used in the sections below. Their order in the pipeline is as follows:
+
+[cols="1,3,3,5,4",options="header"]
+|=======================================
+|Table|ID|Description|Flow Hit|Flow Miss
+|1|+PORT_SECURITY+|Optional port security table|Proceed to +SEPG_FILTER+|Drop
+|2|+SEPG_FILTER+|sEPG selection|Remember sEPG, BD, and VRF. Then proceed to +DEPG_FILTER+|Trigger policy resolution (send to controller)
+|3|+DEPG_FILTER+|dEPG selection|Remember dEPG and output coordinates, proceed to +POLICY_ENFORCER+|Trigger policy resolution (send to controller)
+|4|+POLICY_ENFORCER+|Policy enforcement|Forward packet|Drop
+|=======================================
+
+OpenFlow >=1.1 capable switches can implement the flow miss policy for each table directly. Pure OpenFlow 1.0 switches will need to have a catch-all flow inserted to enforce the specified policy.
+
+====== Port Security
+
+An optional port security table can be inserted at the very beginning of the pipeline. It enforces a list of valid sMAC and sIP addresses for a specific port.
+
+----
+priority=30, in_port=TUNNEL_PORT, actions=goto_table:SEPG_FILTER
+priority=30, in_port=PORT1, dl_src=MAC1, actions=goto_table:SEPG_FILTER
+priority=30, in_port=PORT2, dl_src=MAC2, ip, nw_src=IP2, actions=goto_table:SEPG_FILTER
+priority=20, in_port=PORT2, dl_src=MAC2, ip, actions=drop
+priority=10, in_port=PORT2, dl_src=MAC2, actions=goto_table:SEPG_FILTER
+priority=30, in_port=PORT3, actions=goto_table:SEPG_FILTER
+----
+
+The port-security flow-miss policy is set to drop in order for packets received on an unknown port or with an unknown sMAC/sIP to be rejected.
+
+The following modes of enforcement are defined:
+
+. Whitelisted: The port is allowed to use any addresses. All tunnel ports must be whitelisted. The filter is enforced with a single flow matching on in_port and redirects to the next table.
+. L2 enforcement: Any packet from the port must use a specific sMAC. The filter is enforced with a single flow matching on the in_port and dl_src and redirects to the next table.
+. L3 enforcement: Same as L2 enforcement. Additionally, any IP packet from the port must use a specific sIP. The filter is enforced with three flows with different priority.
+.. Any IP packet with correct sMAC and sIP is redirected to the next table.
+.. Any IP packet left over is dropped.
+.. Any non-IP packet with correct sMAC is redirected to the next table.
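+
+The three enforcement modes can be summarized by a small generator that emits the flows for one port.  The function is a hypothetical helper that produces flow text in the style of the example above; it is not validated against `ovs-ofctl`.
+
```python
def port_security_flows(port, mac=None, ip=None):
    """Return the port security flows for one port.

    No mac: whitelisted port (one flow).
    mac only: L2 enforcement (one flow).
    mac and ip: L3 enforcement (three flows with different priorities).
    """
    next_table = "goto_table:SEPG_FILTER"
    if mac is None:
        return [f"priority=30, in_port={port}, actions={next_table}"]
    if ip is None:
        return [f"priority=30, in_port={port}, dl_src={mac}, actions={next_table}"]
    return [
        # IP packets with the correct sMAC and sIP continue in the pipeline.
        f"priority=30, in_port={port}, dl_src={mac}, ip, nw_src={ip}, actions={next_table}",
        # Any other IP packet from this MAC is dropped.
        f"priority=20, in_port={port}, dl_src={mac}, ip, actions=drop",
        # Non-IP packets with the correct sMAC continue in the pipeline.
        f"priority=10, in_port={port}, dl_src={mac}, actions={next_table}",
    ]
```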
+
+====== Source EPG & L2/L3 Domain Selection
+
+The sEPG is determined based on a separate flow table which maps known OpenFlow port numbers and tunnel identifiers to a locally unique sEPG ID. The sEPG ID is stored in register NXM_NX_REG1 for later use in the pipeline. At the same time, the L2 and L3 contexts are determined and stored in registers NXM_NX_REG2 and NXM_NX_REG5, respectively.
+
+[cols="1m,2",width="75%",options="header"]
+|====
+|Field|Description
+|table=SEPG_FILTER|Flow must be in sEPG selection table
+|in_port=$OFPORT|Flow must match on incoming port
+|tun_id=$VNI|If in_port is a tunnel, flow must match on tunnel ID
+|====
+
+The actions performed are:
+
+. Write sEPG ID corresponding to incoming port or tunnel ID to register
+. Write L2/L3 context ID corresponding to incoming port or tunnel ID to registers
+. Proceed to dEPG selection
+
+An example flow to map a local port to an sEPG:
+----
+table=SEPG_FILTER, in_port=$OFPORT,
+actions=load:$SEPG->NXM_NX_REG1[],
+        load:$BD->NXM_NX_REG2[],
+        load:$VRF->NXM_NX_REG5[],
+        goto_table:$DEPG_FILTER
+----
+
+An example flow to map a tunnel ID to an sEPG:
+----
+table=SEPG_FILTER, in_port=TUNNEL_PORT, tun_id=$VNI1,
+actions=load:$SEPG1->NXM_NX_REG1[],
+        load:$BD->NXM_NX_REG2[],
+        load:$VRF->NXM_NX_REG5[],
+        goto_table:$DEPG_FILTER
+----
+
+A flow hit means that the sEPG is known and the pipeline should proceed to the next stage.
+
+A flow miss means that we have received a packet from an unknown EPG:
+
+. If the packet was received on a local port, then this corresponds to the discovery of a new EP for which the port to EPG mapping has not been populated yet. If the network is in learning mode, generate a packet-in to trigger policy resolution, otherwise drop the packet.
+. If the packet was received from a tunnel, then this corresponds to a packet for which we have not populated the tunnel ID to EPG mapping yet. If the network is in learning mode, generate a packet-in to trigger policy resolution, otherwise drop the packet.
+
+====== Broadcasting / Multicasting
+
+Packets sent to the MAC broadcast address (+ff:ff:ff:ff:ff:ff+) must be flooded to all ports belonging to the broadcast domain. This is *not* equivalent to the OVS flood action as multiple broadcast domains reside on the same switch. The respective broadcast domains are modeled using OpenFlow group tables as follows:
+
+. Upon addition of a new broadcast domain to the local vSwitch:
+ * Create a new OpenFlow group table, using the BD ID as group ID
+
+   ovs-ofctl [...] add-group BRIDGE group_id=$BD, type=all
+
+ * Create a flow in the dEPG selection table matching on broadcast packets and correctly have them flooded to all group members:
+
+   priority=10, table=$DEPG_FILTER, reg2=$BD, dl_dst=ff:ff:ff:ff:ff:ff, actions=group:$BD
+
+. Upon addition/removal of a local port
+ * Modify group and add/remove output action to port to account for membership change:
+
+   ovs-ofctl [...] mod-group $BRIDGE [Old entry,] bucket=output:$PORT
+
+. Upon addition/removal of a non-local port to the BD
+ * Modify group and add/remove output + tunnel action to start/stop flooding packets over overlay
+
+====== Special Packet Types
+
+[red]*ARP Responder*
+
+In order for the distributed L3 gateway to be reachable, the vSwitch must respond to ARP requests sent to the default gateway address. For this purpose, a flow is added which translates ARP requests into ARP replies and sends them back out the incoming port.
+
+[cols="1m,2",width="75%",options="header"]
+|====
+|Field|Description
+|priority=20|Must have higher priority than regular, non-ARP dEPG table flows.
+|table=DEPG_FILTER|Flow must be in dEPG selection table
+|reg5=2|Must match a specific L3 context (+NXM_NX_REG5+)
+|arp, arp_op=1|Packet must be ARP request
+|arp_tpa=GW_IP|ARP request must be targeted for IP of gateway
+|====
+
+The actions performed are:
+
+. Set dMAC to original sMAC of packet to reverse direction
+. Set sMAC to MAC of gateway
+. Set ARP operation to 2 (ARP reply)
+. Set target hardware address to original source hardware address
+. Set source hardware address to MAC of gateway
+. Set target protocol address to original source protocol address
+. Set source protocol address to IP of gateway
+. Transmit packet back out the incoming port
+
+----
+priority=20, table=DEPG_FILTER, reg5=$VRF,
+arp, arp_op=1, arp_tpa=$GW_ADDRESS,
+actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],
+        mod_dl_src:$GW_MAC,
+        load:2->NXM_OF_ARP_OP[],
+        move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],
+        load:Hex($GW_MAC)->NXM_NX_ARP_SHA[],
+        move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],
+        load:Hex($GW_ADDRESS)->NXM_OF_ARP_SPA[],
+        in_port
+----
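+
+The `load` actions in the ARP responder flow take numeric values, so the gateway MAC and IP address have to be converted to hexadecimal first (the Hex placeholder in the flow).  A minimal sketch of that conversion, with hypothetical helper names, might look like this:
+
```python
import ipaddress


def mac_to_hex(mac: str) -> str:
    """Convert a colon-separated MAC address to a single hex literal."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    return "0x" + "".join(f"{int(octet, 16):02x}" for octet in octets)


def ipv4_to_hex(address: str) -> str:
    """Convert a dotted-quad IPv4 address to a 32-bit hex literal."""
    return f"0x{int(ipaddress.IPv4Address(address)):08x}"
```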
+
+[red]*ARP Optimization*
+
+.GBP OVS ARP Optimization
+
+image::gbp_ovs_arp_optimization.png[width="50%"]
+
+As the MAC / IP pairing of endpoints is known in the network, ARP requests can be optimized and translated into unicasts. While it is possible to have a local vSwitch become an ARP responder directly, the unicast translation offers a minimal aliveness check within the scope of the L2 context.
+
+A flow is inserted into the sEPG selection table as follows:
+----
+priority=10, arp, arp_op=1, dl_dst=ff:ff:ff:ff:ff:ff, actions=controller
+----
+
+When the ARP request is received, the packet is sent to the controller. The controller/agent resolves the requested IP address to the corresponding MAC address and inserts a new flow that rewrites the destination MAC of subsequent ARP requests for the same target address directly in the vSwitch:
+----
+ priority=15, table=DEPG_FILTER,
+ arp, arp_op=1, arp_tpa=$IP, dl_dst=ff:ff:ff:ff:ff:ff,
+ actions=mod_dl_dst:$MAC,
+         load:${DEPG}->NXM_NX_REG3[],
+         load:${OFPORT}->NXM_NX_REG4[],
+         goto_table:$POLICY_ENFORCER
+----
+
+The +OFPORT+ is either a local port or the tunnel port. The latter case additionally requires setting the tunnel ID as described in previous sections.
+
+[NOTE]
+========
+The controller can proactively insert ARP optimization flows for local or even remote endpoints to avoid the one time controller round trip penalty.
+========
+
+The controller/agent then reinjects the original ARP request back into the network via a packet-out OpenFlow message.
+
+====== Destination EPG Selection (L2)
+
+The dEPG selection is performed after the sEPG has been determined. The mapping occurs in its own flow table which contains both L2 and L3 flow entries. This section explains L2 processing; L3 processing is described in the next section.
+
+The purpose of flow entries in this table is to map known destination MAC addresses in a specific L2 context to a dEPG and to prepare the action set for execution after policy enforcement.
+
+[cols="1m,2",width="70%",options="header"]
+|====
+|Field|Description
+|priority=10|Must have lower priority than L3 flows
+|table=DEPG_FILTER|Flow must be in dEPG selection table
+|reg2=2|Must match on L2 context (NXM_NX_REG2)
+|dl_dst=MAC|Packet must match on destination MAC of the EP
+|====
+
+The actions performed are:
+
+. Write dEPG ID corresponding to dMAC to register to allow matching on it during policy enforcement
+. Write expected outgoing port number to register. This can be a local or a tunnel port.
+. If outgoing port is a tunnel, also include an action to set the tunnel ID and tunnel destination to map the sEPG to the tunnel ID.
+. Proceed to policy enforcement
+
+Example flow for a local endpoint mapping:
+----
+table=$DEPG_FILTER, reg2=$BD, dl_dst=$MAC,
+actions=load:$DEPG->NXM_NX_REG3[],
+        load:$OFPORT->NXM_NX_REG4[],
+        goto_table:$POLICY_ENFORCER
+----
+
+Example flow for a remote endpoint mapping:
+----
+table=$DEPG_FILTER, reg2=$BD, dl_dst=$MAC,
+actions=load:$DEPG->NXM_NX_REG3[],
+        load:$TUNNEL_PORT->NXM_NX_REG4[],
+        move:NXM_NX_REG1[]->NXM_NX_TUN_ID[],
+        load:$TUNNEL_DST->NXM_NX_TUN_IPV4_DST[],
+        goto_table:$POLICY_ENFORCER
+----
+
+A flow hit indicates that both the sEPG and dEPG are known at this point and the packet can proceed to policy enforcement.
+
+A flow miss indicates that the dEPG is not known. If the network is in learning mode, generate a packet-in, otherwise drop the packet.
+
+====== Destination EPG Selection (L3)
+
+Much like L2 flows in the dEPG selection table, L3 flows map known destination IP addresses to the corresponding dEPG and outgoing port number.
+
+Additionally, flow hits will result in a routing action being performed.
+
+[cols="1m,2",width="70%",options="header"]
+|====
+|Field|Description
+|priority=15|Must have higher priority than L2 but lower than ARP flows.
+|table=DEPG_FILTER|Flow must be in dEPG selection table
+|reg5=2|Must match on L3 context (NXM_NX_REG5)
+|dl_dst=GW_MAC|Packet must match MAC of gateway
+|nw_dst=PREFIX|Packet must match on an IP subnet
+|====
+
+The actions performed are:
+
+. Write the dEPG ID corresponding to the destination subnet to a register so that it can be matched during policy enforcement.
+. Write the expected outgoing port number to a register. This can be a local or a tunnel port.
+. If the outgoing port is a tunnel, also include an action to set the tunnel ID and tunnel destination so that the sEPG is mapped to the tunnel ID.
+. Modify the destination MAC to that of the nexthop. The nexthop can be the EP itself or another router.
+. Set the source MAC to the MAC of the local default gateway.
+. Decrement the TTL.
+. Proceed to policy enforcement.
+
+Example flow for a local endpoint over L3:
+
+----
+table=$DEPG_FILTER, reg5=$VRF, dl_dst=$ROUTER_MAC, ip, nw_dst=$PREFIX,
+actions=load:$DEPG->NXM_NX_REG3[],
+        load:$OFPORT->NXM_NX_REG4[],
+        mod_dl_dst:$DST_EP_MAC,
+        mod_dl_src:$OWN_ROUTER_MAC,
+        dec_ttl,
+        goto_table:$POLICY_ENFORCER
+----
+
+Example flow for a remote endpoint over L3:
+
+----
+table=$DEPG_FILTER, reg5=$VRF, dl_dst=$ROUTER_MAC, ip, nw_dst=$PREFIX,
+actions=load:$DEPG->NXM_NX_REG3[],
+        load:$TUNNEL_PORT->NXM_NX_REG4[],
+        move:NXM_NX_REG1[]->NXM_NX_TUN_ID[],
+        load:$TUNNEL_DST->NXM_NX_TUN_IPV4_DST[],
+        mod_dl_dst:$NEXTHOP,
+        mod_dl_src:$OWN_ROUTER_MAC,
+        dec_ttl,
+        goto_table:$POLICY_ENFORCER
+----
+
+====== Policy Enforcement
+
+Given that the sEPG, BD/VRF, and dEPG are known at this point, the policy is enforced in a separate flow table by matching on the sEPG and dEPG found in the respective registers. Additional filters may be applied as specified by the policy.
+
+[cols="1m,2",width="80%",options="header"]
+|====
+|Field|Description
+|table=POLICY_ENFORCER|Flow must be in policy enforcement table.
+|reg1=$SEPG|Must match on sEPG of packet
+|reg3=$DEPG|Must match on dEPG of packet
+|====
+
+The policy may require matching on additional fields such as L4 ports, TCP flags, labels, and conditions.
+
+The actions performed on flow hit depend on the specified policy and are described in the next section.
+
+Example of a flow in the policy enforcement table:
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG, tcp, tcp_dst=$DPORT/$MASK,
+actions=output:NXM_NX_REG4[]
+----
+
+A flow miss indicates that no policy has been specified or the policy has not been populated. Depending
+on whether the policy population is proactive or reactive, the action on flow miss is either drop or
+notification of the controller/agent to trigger policy resolution.
+
+====== Policy Actions & Packet Rewrite
+
+The policy may specify multiple actions which are to be performed on matching policy classifiers.
+The following actions are supported:
+
+[red]*Accept*
+
+Forward/route the packet as previously selected in the dEPG selection table. This translates to
+executing the queued up action set and forwarding the packet to the port number stored in
++NXM_NX_REG4+ which represents the L2 nexthop.
+
+Basic example flow to allow an sEPG to talk to a dEPG:
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=output:NXM_NX_REG4[]
+----
+
+[red]*Drop*
+
+Disregard any previous forwarding or routing decision and drop the packet:
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=clear_actions, drop
+----
+
+[red]*Log*
+
+The logging action is an extension of the drop action. It sends the packet to the controller for logging
+purposes. The controller then drops the packet.
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=clear_actions, controller:[...]
+----
+
+[red]*Set QoS*
+
+The *Set QoS* action modifies the QoS mark of a packet. This includes the DiffServ field as well as ECN information. This action may only be applied to IP packets.
+
+This action is typically followed by an accept or redirect action.
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=mod_nw_tos:$TOS, mod_nw_ecn:$ECN, ...
+----
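+
+As a worked example, the following sketch marks matching IP traffic with DSCP 46 (Expedited Forwarding) and then accepts it. The +mod_nw_tos+ action takes the ToS byte with the two ECN bits cleared, so the value is the DSCP shifted left by two bits, which gives 184 for DSCP 46. The choice of DSCP 46 and the +ip+ match are assumptions made for illustration only.
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG, ip,
+actions=mod_nw_tos:184, output:NXM_NX_REG4[]
+----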
+
+[red]*Redirect / Service Redirection*
+
+Service insertion or redirection can be defined as an action between EPGs in the policy. It may occur transparently, i.e. without changing the packet in any way, or non-transparently by explicitly redirecting the packet to the service node.
+
+*Non-transparent Service Insertion*
+
+Non-transparent service insertion is used to redirect packets to a service such as a web proxy, which requires the packet to be addressed to the service. The vSwitch forwarding behavior to achieve this is identical to an L2/L3 switching/routing action to any other EP.
+
+The specific action chain depends on whether the service is located within the same BD or whether routing is required. The controller/agent is aware of the location of both EPs and inserts the required action set. The following is an example of an L2 non-transparent service redirection:
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=mod_dl_dst:$MAC_OF_SERVICE,
+        load:$TUNNEL_PORT->NXM_NX_REG4[],
+        move:NXM_NX_REG1[]->NXM_NX_TUN_ID[],
+        load:$TUNNEL_DST->NXM_NX_TUN_IPV4_DST[],
+        output:NXM_NX_REG4[]
+----
+
+*Transparent Service Insertion*
+
+Transparent service insertion is used to redirect packets to a service such as a firewall which does not require a packet to be specifically addressed to the service. The service will be applied to all packets on the virtual network. This requires that the service only sees packets to which the service should be applied.
+
+The required forwarding behavior is to encapsulate the packet with the appropriate VNID. There is no need to rewrite any of the L2 headers.
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=load:$TUNNEL_PORT->NXM_NX_REG4[],
+        load:$VNI_OF_SERVICE->NXM_NX_TUN_ID[],
+        load:$TUNNEL_DST->NXM_NX_TUN_IPV4_DST[],
+        output:NXM_NX_REG4[]
+----
+
+The redirect action in the policy will specify the VNID and VTEP to be used.
+
+TBD: Does the pipeline always stop after a redirect action has been processed?
+
+[red]*Mirror*
+
+This action causes the packet to be cloned and forwarded to an additional port (port mirroring).
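+
+This document does not prescribe how mirroring is rendered. One possible sketch, assuming a hypothetical +$MIRROR_PORT+ placeholder for the port that receives the copy, outputs the packet twice: once to the mirror port and once to the port selected in the dEPG selection table.
+
+----
+table=$POLICY_ENFORCER, reg1=$SEPG, reg3=$DEPG,
+actions=output:$MIRROR_PORT, output:NXM_NX_REG4[]
+----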
+
+[[openflow_renderer]]
+===== OpenFlow/OVS Renderer
+
+The OpenFlow renderer is based on the <<ovs_overlay,OVS Overlay>> design and implements a network virtualization solution for virtualized compute environments, driving Open vSwitch remotely from the controller through OpenFlow and OVSDB.
+
+The OpenFlow renderer architecture consists of the following:
+
+Switch Manager::
+Manage connected switch configuration using OVSDB. Maintain overlay tunnels.
+Endpoint Manager::
+Optionally learn endpoints based on simple rules that map interfaces to endpoint groups. Additional rules can be added in the future. Keep the endpoint registry up to date. If learning is disabled, an orchestration system must program all endpoints and endpoint mappings.
+ARP and DHCP Manager::
+Convert ARP and DHCP into unicast.
+Policy Manager::
+Subscribe to the renderer common infrastructure and to the switch and endpoint managers. Manage the state of the flow tables in OVS.
+
+[[opflex_renderer]]
+===== OpFlex Renderer
+
+The OpFlex renderer is based on the <<ovs_overlay,OVS Overlay>> design and implements a network virtualization solution for virtualized compute environments by communicating with the OpFlex Agent.
+
+The OpFlex renderer architecture consists of the following main pieces:
+
+Agent Manager::
+Manage connected agents using OpFlex.
+RPC Library::
+Manage serialization/deserialization of JSON RPC with Policy Elements.
+OpFlex Messaging::
+Provide definitions of OpFlex messages and their serialization/deserialization into Managed Objects.
+Endpoint Manager::
+Optionally learn endpoints based on simple rules that map interfaces to endpoint groups. Additional rules can be added in the future. Keep the endpoint registry up to date. If learning is disabled, an orchestration system must program all endpoints and endpoint mappings.
+Policy Manager::
+Subscribe to the renderer common infrastructure and the endpoint registry, and provide normalized policy to agents.
  
diff --git a/manuals/developers-guide/src/main/resources/images/Group-based_policy_architecture.png b/manuals/developers-guide/src/main/resources/images/Group-based_policy_architecture.png

new file mode 100644 (file)

index 0000000..34336e1

Binary files /dev/null and b/manuals/developers-guide/src/main/resources/images/Group-based_policy_architecture.png differ
diff --git a/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_blue_to_red_tunnel.png b/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_blue_to_red_tunnel.png

new file mode 100644 (file)

index 0000000..21bf89b

Binary files /dev/null and b/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_blue_to_red_tunnel.png differ
diff --git a/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_red_to_outside.png b/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_red_to_outside.png

new file mode 100644 (file)

index 0000000..ed0440a

Binary files /dev/null and b/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_red_to_outside.png differ
diff --git a/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_red_tunnel.png b/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_red_tunnel.png

new file mode 100644 (file)

index 0000000..a5f7d5f

Binary files /dev/null and b/manuals/developers-guide/src/main/resources/images/gbp_overlay_design_red_tunnel.png differ
diff --git a/manuals/developers-guide/src/main/resources/images/gbp_ovs_arp_optimization.png b/manuals/developers-guide/src/main/resources/images/gbp_ovs_arp_optimization.png

new file mode 100644 (file)

index 0000000..c2a9a17

Binary files /dev/null and b/manuals/developers-guide/src/main/resources/images/gbp_ovs_arp_optimization.png differ
diff --git a/manuals/developers-guide/src/main/resources/images/gbp_ovs_pipeline.png b/manuals/developers-guide/src/main/resources/images/gbp_ovs_pipeline.png

new file mode 100644 (file)

index 0000000..ad82c46

Binary files /dev/null and b/manuals/developers-guide/src/main/resources/images/gbp_ovs_pipeline.png differ