An ecology of social categories.

This article proposes that meaningful social classification emerges from an ecological dynamic that operates in two planes: feature space and label space. It takes a dynamic view of classification, allowing objects’ movements in both spaces to change the meaning of social categories. The first part of the theory argues that agents assign labels to objects based on perceptions of their similarities to existing members of a category. The second part of the theory shows that an object’s perceived similarity to members of other categories reduces its typicality in a focal category. This means that for categories with a high degree of overlap with other categories in label space (lenient categories), the link between feature-based similarities and labeling weakens. The findings suggest that social classification will likely evolve to contain both constraining and lenient categories. The theory implies that this process is self-reinforcing, so that constraining categories become more constraining, whereas lenient categories become more lenient.

Categories locate actors in comparative sets and delineate standards for evaluation. Social agreement about what constitutes membership in categories imposes constraint that induces actors to conform to categorical types. This is an instance of a general dynamic that has long interested sociologists: social structures constrain individual actors, then widespread conformity leads to the replication of the structures.
Research documents the homogenizing effects of social categories. Organizations adopt similar procedures, architectures, and even identities, not only because the practices are effective but also to conform to taken-for-granted ideas about how to organize. What drives this dynamic is that objects have less appeal and receive fewer rewards when they do not fit into an accepted category (Zuckerman 1999; Hsu 2006; Hsu et al. 2009; Leung and Sharkey 2014). In this view, complying with categorical imperatives provides benefits. As a result, actors conform to categorical standards and existing category codes get replicated. This rendering depicts agents as navigating a static category structure when deciding how to classify or make sense of objects and actors.
There is reason to question whether categorical structures tend toward stasis. The vibrant lines of research on categories and markets build on the view that audiences and market intermediaries create categories. Audience members come to agree on labels and codes as they seek to interpret action in a domain. Previous research has focused on how agreed-upon categorical meanings affect perceptions and evaluations. But the converse also holds: the characteristics of objects that get categorized shape meanings. For example, when telephone technology migrated from rotary-dial to touch-tone, the meaning of <telephone> evolved to include the new technology.¹ More generally, when an object changes its characteristics or gets reclassified, this potentially changes the meaning of one or more categories. This means that social categories are less durable than we generally assume.
The question of whether social structures affect the behavior of individual actors, or whether individual behaviors shape social structure, is pertinent to many areas in sociological research. Scholars studying institutions and organizational forms maintain that social structures are homogenizing and self-replicating while acknowledging that these structures, and the meanings they invoke, frequently change. This tension has been extensively discussed as an important unresolved issue in organizational sociology (Romanelli 1991; Clemens and Cook 1999). That social structures are simultaneously rigid enough to induce widespread conformity, and malleable enough to change as a result of nonconformist behavior, presents an underlying tension in the literature.
Recent work on classification sidesteps this issue by assuming that category boundaries are policed by market intermediaries such as stock market analysts (Zuckerman 1999), credit agencies (Ruef and Patterson 2009), critics (Hsu 2006; Rao et al. 2003; Negro et al. 2010; Negro and Leung 2013), and government agencies (Ruef 2000). But what has historically interested sociologists in categorization is that macro-level structures emerge through uncoordinated individual beliefs about social meaning embedded in objects or concepts (Berger and Luckmann 1967; Durkheim 1931; Meyer and Rowan 1977). For instance, sociological studies of markets show that stable market structure can arise when organizations independently take market positions in response to positions taken by competitive rivals (White 1981; Porac et al. 1995). These ideas also underpin recent work that models the emergence of categories on the basis of audience members coming to agreement about schemas that capture the bases of similarity among objects (Hannan et al. 2007).
We suggest that meaningful classification can emerge even in an evolving environment comprising loosely coordinated actors. This occurs when people observe current categorization, make similarity judgments based on objects' features, classify them accordingly, and share their classifications. If the process is not coordinated, some objects might get reclassified frequently. This can change the meanings of categories, which then alters how objects with stable categorizations get interpreted. It might seem that this type of process would lead to chaos, with meanings and memberships changing too rapidly to coalesce into a coherent system of categories. In contrast, we show that classification can emerge even when objects change features and categorical meanings evolve.
We argue that this will produce an ecological dynamic for social categorization that operates across two planes: feature space and label space. Feature space locates objects by their feature values. Label space contains labels for categories. The two planes are connected when agents apply a label to a profile of feature values. When an audience reaches consensus about the link between a label and a set of features, the label denotes a category.
Associations between category labels and features are straightforward in a static world. But categorization becomes more complex when we consider that the social meaning of categories can change. For example, when many category members alter their characteristics, this shifts how people conceive of the category. This means a focal category member who was once typical of that category might become atypical. Such an outcome characterizes obsolescence processes. A scientist who still teaches and researches what she learned decades ago might have been a highly typical member of her discipline as a new graduate but very atypical today. More generally, changes in feature values by some members can affect categorization for others.
A similar dynamic plays out for labeling. If atypical entities are assigned membership in a category (as when producers of processed foods start being labeled as <natural>), then the category will become less crisp. As a result, existing members may no longer be considered typical of the category. For instance, executives of organizations with nanotechnology capabilities concluded that their organizations should not be included in the <nanotechnology> category when the label was used so broadly that its underlying meaning no longer reflected their companies' capabilities (Granqvist et al. 2013).
We study these dynamics by building an explicit theoretical model of the coupling of feature space and label space. Our model relies on two processes, both based on similarity judgments. The first involves the similarity of a newly observed object to the current membership of a category. We argue (and show empirically) that feature-based similarities strongly influence label assignments even as actors change positions in both spaces. As objects move, conceptions of a category can change, and so can its membership.
But a link between features and labels persists. In the <telephone> example, the sets of features associated with this category have evolved from rotary dial to touch-tone to touch-screens and small hand-held electronic devices. The specific characteristics associated with the concept have changed, but it still retains a well-defined meaning.
The second process underlying our model concerns the similarity of a focal category to other categories. Our theory and empirical findings imply that extensive overlap in category memberships weakens the link between feature values and label assignments. This can be seen in the example of <nanotechnology>, which evolved to overlap with a number of scientific disciplines including organic chemistry, molecular biology, semiconductor physics, and so on. Not only does the meaning of this category evolve, so too does the strength of the link between the label and the set of features expected of category members.
In addressing these issues we employ a formal language proposed by Hannan et al. (2007) and further elaborated by Hsu et al. (2009) and Kovács and Hannan (2011). We think a formal theory is needed because these notions can be slippery and erroneous reasoning can easily result. Formal languages work against tendencies in natural language to skip steps and misidentify required assumptions, which can lead to imprecise conclusions.

Category Labels in a Domain
Labels play a central role in our analysis because they denote concepts, mental representations of similarity and difference (Murphy 2002). Although such representations need not be paired with labels, they often are. In the case of social classification, we generally expect concepts to be labeled, otherwise people could not easily communicate about what they represent. Sociological analysis focuses on categories, where a label not only represents a concept for a particular individual but also conveys broad agreement about the pairing of a concept with a label, which gives the label a social meaning. Because people use labels to communicate about concepts and categories, much useful analysis can be done with primary reference to their labels. For this reason we can simplify the language of our argument by restricting the term "label" to refer to tags that are paired with meanings. We often refer to labels as shorthand for the concepts/categories that they designate.
We construct a theory for the general case in which agents, as audience, classify producers. Audience and producer are paired roles, and in many contexts the same actors play both roles.² That is, producers can be audience to each other. We do not specify a particular kind of audience because our assumptions plausibly describe processes that apply for many kinds of agents playing an audience role, including critics, consumers, rivals, and the producers themselves.
We build our argument at the level of an individual person as an audience member. This focus cuts a layer of complexity that arises because different audiences have different perspectives on classification (Pontikes 2012; Hsu and Elsbach 2013). For example, RottenTomatoes.com and IMDb.com agree in classifying Mel Brooks's sendup of the western Blazing Saddles as <comedy> and <western>. But Rotten Tomatoes classifies The Blues Brothers as <action/adventure> and <comedy>, while IMDb classifies it as <comedy>, <music>, and <musical>. Our analysis considers the issues from the perspective of individual members of an audience.

Feature Space and Label Space
The received view of classification treats labels as tied to fixed sets of feature values. If the relationship between labels and features remains static, then it is sufficient to model a single categorical space that individuals and organizations navigate. But once we allow the meanings of categories to change (and the possibility of change to be contingent on how actors navigate the categorical structure), it is necessary to model the potentially variable connections between the spaces. We aim to explore how actors' movements affect the stability of social classification. So, the bedrock of our theory is a model of two spaces, feature space and label space, and the mapping that connects them. Feature space and label space are instantiations of what Gärdenfors (2004) calls conceptual spaces (see also Widdows 2004).
Research in psychology presents three standard accounts regarding the mental representation of concepts. A representation is stored as (1) a prototype, a "best example," (2) exemplars, known images of the concept, or (3) a schema, a structured view of a concept (Smith and Medin 1981; Murphy 2002). All three perspectives agree that an object's observable characteristics determine categorization. Research on social categories as part of a cultural system has concentrated on the third case, where concepts are stored as schemas, which tell the patterns of feature values that closely fit the concept (DiMaggio 1997). We follow this view in constructing models.
We represent producers by their positions in a feature space of observable characteristics. Producers can lie at varying distances from each other, and they move in this space when they change their feature values. In formal terms, we let F denote the space of the combinations of feature values relevant to agents in a domain.³ For this (and many other sociological applications) feature space can be represented as a weighted graph. Such a representation defines nodes as feature profiles, combinations of feature values. Suppose that there are K relevant features in a particular context. Then the set of positions is the set of ordered K-tuples of feature values. For instance, if the features are binary, then there are $2^K$ positions (vertices in the graph space). We can specify this type of space using a graph relation that tells which vertices are accessible from which other vertices. To represent this, the crucial issue concerns how many features can be changed at one time. Analysis is greatly simplified if we assume that only one feature value can be changed at a time.⁴ This means that the edges in the graph connect positions that differ on only one feature value.
To fix ideas, consider Figure 1, which depicts the graph for a space defined for three binary features (with the restriction that edges connect only those vertices that differ by one feature value). Each of the eight vertices gives a profile of feature values. For example, (0,0,0) describes the position for which each of the three relevant features has the value of zero. The edges connect vertices that differ on only one value; for example, the vertex (0,0,0) is connected to (1,0,0), (0,1,0), and (0,0,1). The vertices in this diagram are what we call feature profiles.
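To make the construction concrete, the following minimal Python sketch enumerates the vertices and edges of a feature space with K binary features under the one-change-at-a-time restriction. The function name `feature_graph` is hypothetical and introduced only for illustration; it is not part of the original model.

```python
from itertools import product


def feature_graph(k):
    """Return (vertices, edges) for k binary features.

    Vertices are feature profiles (k-tuples of 0/1). An edge joins two
    profiles that differ on exactly one feature value.
    """
    vertices = list(product((0, 1), repeat=k))
    edges = set()
    for v in vertices:
        for i in range(k):
            # Flip feature i to obtain the adjacent profile.
            w = v[:i] + (1 - v[i],) + v[i + 1:]
            edges.add(frozenset((v, w)))
    return vertices, edges


vertices, edges = feature_graph(3)
origin = (0, 0, 0)
# Profiles reachable from (0,0,0) by changing a single feature value.
neighbors_of_origin = sorted(w for e in edges if origin in e for w in e if w != origin)
```

With three binary features this yields the eight vertices of a cube, and the neighbors of (0,0,0) are the three profiles named in the text.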
In a standard graph space, distance is given by path length, specifically the length of the shortest path that connects a pair of nodes. In our construction, each edge is associated with a transformation distance, a real-valued weight that refers to the scale of the change required to move from one configuration to another (Hahn et al. 2003). We define a distance between positions (vertices) as the scale of transformation required to convert from one set of feature values to another. So the distance from (0,0,0) to (1,0,0) is the scale of the transformation needed to change the first feature from 0 to 1. If this feature is, say, the form of authority, then a much deeper organizational transformation is required to shift from traditional to rational-legal (to use two of Weber's types) than to change more technical features.
We treat distances as subjective judgments from the view of a focal agent and also allow the possibility that the structure of the space morphs over time. Precision demands that we tune the formal notation to fit these considerations by including argument slots for the agent and the time point in the distance function. Economy of expression argues the opposite. We use the more economical representation and omit from our notation the dependence of each predicate and function on these arguments. But each function and predicate should be considered as dependent on the agent and on the time point. With this convention we express distance in feature space as $d_F(v, v')$, the scale of the transformation required to convert from configuration $v$ to the configuration $v'$. The pair $\langle F, d_F \rangle$ defines the metric feature space.
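A minimal sketch of a weighted feature-space distance follows. It rests on two assumptions that the text does not state: that the transformation distance of an edge depends only on which feature is changed (the per-feature `WEIGHTS` are hypothetical numbers), and that the distance between non-adjacent profiles is the smallest total weight over connecting paths. The dependence on agent and time point is omitted, as in the paper's notation.

```python
import heapq

# Hypothetical transformation distances, one per feature. Feature 0 stands in
# for a deep change (e.g., form of authority); the others for technical changes.
WEIGHTS = (5.0, 1.0, 1.0)


def neighbors(v):
    """Yield (adjacent profile, edge weight) pairs; one feature flips per step."""
    for i, w in enumerate(WEIGHTS):
        yield v[:i] + (1 - v[i],) + v[i + 1:], w


def d_F(source, target):
    """Assumed feature-space distance: least total weight over paths (Dijkstra)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, v = heapq.heappop(queue)
        if dist > best.get(v, float("inf")):
            continue  # stale queue entry
        if v == target:
            return dist
        for u, w in neighbors(v):
            nd = dist + w
            if nd < best.get(u, float("inf")):
                best[u] = nd
                heapq.heappush(queue, (nd, u))
    return float("inf")
```

Under these assumed weights, changing the first feature alone costs more than changing both of the other two, which mirrors the contrast between deep organizational and technical changes.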
The second space of interest is also a graph space, one whose vertices are labels. We refer to this space as a label space, represented as L. Distances in this space, denoted as $d_L(l, l')$, tell the distances between labels. Again we allow the possibility that distances between adjacent [...]
[Figure 1: graph of a feature space defined by three binary features, with vertices such as (0,0,0) and (0,0,1).]
We need to control the scope of label space, because social actors use many kinds of labels. Some of them are related, but others are not; for example, a person might be labeled as <Hungarian>, <logician>, and <archer>. This difference matters, because movements among related categories can affect the structure in a way that moves among unrelated categories do not. For theoretical analysis, it suffices to restrict the scope of the arguments to cases in which the language has a relatively simple structure. We use the notion of a domain, a super-category whose included categories compose the set to be analyzed (Hannan et al. 2007). Within such a structure, the schemas for all of the subcategories contain the codes for the supercategory's schema (along with other differentiating codes). These categories are related within the language of the audience that makes these distinctions. The scope of our theory is restricted to a single domain.
In addition, we restrict the scope of label space to include labels at the same hierarchical level. If we do not do this, then analysis of the interdependence of movements among categories gets very complicated without shedding additional insight. We limit the scope to sets of categories within a domain that satisfy the restriction that none of the categories considered in the analysis is a subcategory of any other. When we refer to a label, it should be considered to be the tag for a category that is an element in the common language of the focal domain.

Relation between the Spaces
Social classification emerges from the association of labels with features through schematization. A key element of our theoretical development (one not present in other work on social categorization) is that we define explicitly the mapping between features and labels. Our conceptualization builds on the following considerations about the nature of schematization: [...] profiles (not particular feature values) associated with different labels do not intersect.
Schematization preserves distance, meaning that distance in one space can be inferred from distance in the other.
We can make these abstract ideas somewhat more concrete with an example. Figure 2 depicts two graphs and a mapping connecting them. The feature space contains six feature profiles denoted a through f (each individual profile is a set of feature values, as illustrated in Figure 1). Again the edges connect profiles that differ on only one feature value. The spacing between vertices indicates the transformation distance between them. The label space contains three tags: 1, 2, and 3. The morphism we want to characterize maps a and b to 1, c to 2, and d and e to 3, and does not associate a label with f. The absence of edges connecting labels preserves the absence of edges between schemas. The set of schemas that have a common label are shaded in the figure. The dotted lines display the details of the mapping.
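The mapping in the Figure 2 example can be written down directly. The sketch below is illustrative only: the dictionary `sigma` and the helper names are hypothetical, and the code checks just two of the properties discussed in the text (schemas for different labels do not intersect, and distinct labels have distinct schemas).

```python
from itertools import combinations

# Figure 2 example: each label maps to the set of feature profiles in its schema.
sigma = {1: frozenset("ab"), 2: frozenset("c"), 3: frozenset("de")}
profiles = set("abcdef")


def schemas_disjoint(mapping):
    """True if no feature profile belongs to the schemas of two different labels."""
    return all(s.isdisjoint(t) for s, t in combinations(mapping.values(), 2))


def is_one_to_one(mapping):
    """True if distinct labels are mapped to distinct sets of profiles."""
    return len(set(mapping.values())) == len(mapping)


# Profiles that belong to no schema (not every profile needs a label).
unlabeled = profiles - set().union(*sigma.values())
```

Here every label has a schema, while the profile f is left without a label, as in the figure.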
The next step is to provide a formal characterization of the relation between the spaces. Intuition suggests that the mapping be defined from feature space to label space because the existence of clusters of feature values induces labeling (Hannan et al. 2007). However, consideration of the list of desiderata (and the figure that exemplifies them) suggests otherwise. All labels are schematized, but not all combinations of feature values belong to a schema. We can obtain the first three desiderata by defining a one-to-one mapping from labels to sets of feature-value profiles. Such a mapping yields an isomorphism. If we add that the mapping preserves structure, namely, distance, then the isomorphism has the form of an embedding.
Definition 1 (Isomorphic embedding of sets of profiles in feature space in the space of labels). Let $\Phi \subseteq \mathcal{P}(F)$ denote a subset of the powerset of the feature-value profiles,⁵ and L denote the set of labels used by the focal agent for the entities in the domain. We call $\sigma$ a distance-preserving isomorphic embedding of the power set of feature profiles in the set of labels if the following conditions hold: [...] and [...]
The second clause ensures that the mapping preserves distance. It states, using formal language, that if the feature space distance between $\varphi_1$ and $\varphi_2$ is the same as the distance between $\varphi_3$ and $\varphi_4$, the distance between their respective labels is also the same. This embedded isomorphism provides the following novel representation of a schema.
Definition 2 (Schema). An agent's schema for a category is the subset of the feature space that the agent associates with its label (under the isomorphic embedding): $\mathrm{Sch}(\lambda, l, y) \leftrightarrow \lambda = \sigma(l, y)$.
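The two displayed conditions of Definition 1 do not survive in this version of the text. One reading that is consistent with the surrounding prose is sketched below in LaTeX. It is an assumption, not the authors' verbatim formulas: the direction of $\sigma$ (from labels to sets of profiles) follows the prose and Definition 2, the agent argument $y$ is suppressed, and $d_F$ applied to sets of profiles stands for whatever set distance the authors intend.

```latex
% Assumed reconstruction of the two clauses of Definition 1 (illustrative only).
\begin{align*}
\text{(i)}\quad  & \sigma : L \to \Phi \ \text{is one-to-one and onto;} \\
\text{(ii)}\quad & \forall\, \varphi_1, \varphi_2, \varphi_3, \varphi_4 \in \Phi:\;
   d_F(\varphi_1, \varphi_2) = d_F(\varphi_3, \varphi_4) \\
 & \qquad \rightarrow\;
   d_L\bigl(\sigma^{-1}(\varphi_1), \sigma^{-1}(\varphi_2)\bigr)
   = d_L\bigl(\sigma^{-1}(\varphi_3), \sigma^{-1}(\varphi_4)\bigr).
\end{align*}
```

Clause (ii) restates the prose gloss: equal distances between sets of profiles imply equal distances between the labels attached to them.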
We must add an auxiliary assumption that the agents engage in schematization along the lines of the preceding construction.
Auxiliary assumption 1. The agents in the audience apply labels to schemas in a way that corresponds to the isomorphic embedding, $\sigma$.
We restrict our analysis to situations in which the focal agent's schematization can be represented by a distance-preserving isomorphic embedding as in Definition 1. Instead of invoking the predicate $\mathrm{Sch}(\lambda, l, y)$ whenever we introduce a label in a formula, we use the notational shorthand that the symbol $\lambda_y$ refers to a schematized label for the agent y.
A schema might be a singleton, but generally schemas have higher dimensionality. We define the distance between a point in feature space and a schema using the standard definition of the distance between a point and a set: $d_F(f, \lambda)$ is defined as the minimum of the distances of the point to each member of the set. When it comes to the distances between schemas, we need a distance metric appropriate for sets: Hausdorff distance. In the case of the schemas $\lambda$ and $\lambda'$, this metric, denoted as $H(\lambda, \lambda')$, is constructed as follows. First find the element in $\lambda$ whose point-to-set distance to $\lambda'$ is largest and record this distance; then do the same in the other direction, from $\lambda'$ to $\lambda$. The Hausdorff distance between the sets is the maximum of these two distances.⁶ It follows from our construction that the distance between a pair of labels equals the Hausdorff distance between the schemas associated with them under the isomorphic embedding. This construction yields an implicit definition of distance in label space as the Hausdorff distances between the associated schemas.
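The point-to-set and Hausdorff distances can be computed in a few lines for finite schemas. The sketch below uses the standard definition (the larger of the two directed sup-of-inf distances). The Hamming distance and the two example schemas are assumptions made for illustration; they correspond to a feature space in which every edge has unit transformation distance.

```python
def point_to_set(f, schema, d):
    """Distance from profile f to a schema: the minimum over the schema's members."""
    return min(d(f, g) for g in schema)


def hausdorff(a, b, d):
    """Hausdorff distance between two finite, non-empty schemas a and b."""
    forward = max(point_to_set(f, b, d) for f in a)   # farthest member of a from b
    backward = max(point_to_set(g, a, d) for g in b)  # farthest member of b from a
    return max(forward, backward)


def hamming(u, v):
    """Illustrative metric: number of feature values on which two profiles differ."""
    return sum(x != y for x, y in zip(u, v))


# Hypothetical schemas over three binary features.
schema_1 = {(0, 0, 0), (1, 0, 0)}
schema_2 = {(1, 1, 0)}
```

For these two schemas the directed distances differ (2 from `schema_1` to `schema_2`, 1 in the other direction), so taking the maximum matters.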
This defines an agent's schema, which links feature profiles with labels. Whether particular sets of profiles become associated with a label is another question. Of course, a theoretical model cannot predict this, because the audience determines the sets of features that become a label's schema. Recent sociological work on classification has relied on experts, for example critics or government agencies, to assign objects to a category, or to provide the category's extension.⁷ Our argument does not require external agents to maintain categorical boundaries. Rather, we suggest that the link between the spaces builds on the characteristics of the objects that are already labeled (those in the category's extension). Consider the example in Figure 2. Suppose that there are three producers and that $x_1$ has the feature profile a, $x_2$ has b, and $x_3$ has c. Now suppose that a focal agent applies the tag 1 to $x_1$, $x_2$, and $x_3$ (so the agent's extension of this label consists of these three producers). What has happened here is that the focal agent has reacted to the partial membership of $x_3$ in the agent's schema for the label and has assigned this producer membership. Other agents, on learning of the first agent's extension, might experience positive updates in the probability of associating the feature profile c with the tag 1. So the link between features and labels emerges endogenously from the positions of objects in each space. As objects change features and as label assignments change, the link between feature space and label space can also change.
⁶ More formally, $H(\lambda, \lambda') = \max\{\sup_{f \in \lambda} \inf_{f' \in \lambda'} d_F(f, f'),\ \sup_{f' \in \lambda'} \inf_{f \in \lambda} d_F(f, f')\}$, where inf and sup refer to the infimum and supremum, respectively.
⁷ Logicians and linguists define concepts in two ways: extensional and intensional. The extension of a concept refers to the set of objects that satisfy the concept in one fixed context. The intension of a concept refers to its meaning over possibly changing contexts (alternative possible worlds). The extension of a concept is an actual set of members; in the intensional view a concept has a more abstract characterization. For instance, consider how we would define the concept <prime number>. We can give its (partial) extension as {3, 5, 7, 11, ...}. The standard meaning (or intension) of this concept is "a natural number greater than one that is divisible only by one or itself."

Categorization: Assigning Labels to Producers
Next we turn to the issue of categorization: the assignments of labels to objects (producers in our case). We start by considering the link between feature space and label space. We then build our model for the standard view of category statics. We subsequently extend it to consider category dynamics.
Consider the producer's location in feature space, which we denote by $f_x$. Refer again to the example in Figure 2. An agent will normally assign the tag 1 to a producer with feature-value profile $f_x = a$. But a, as drawn in this figure, lies one step from b (in the same equivalence class) and one step from c (not in the class), and $d_F(a, c)$ is not much greater than $d_F(a, b)$. Producers with the feature profile a plausibly could be assigned partial memberships in the categories tagged as 1 and 2. Likewise, the profile f (which is not associated with a label in this example) lies one step from c and one step from e; it stands between the two categories. Perhaps it will be assigned partial memberships in 2 and 3. So, a producer's distance from the schema associated with a concept influences whether it gets assigned full or partial membership.
Considering partial memberships marks a departure from classical understandings of concepts and categories. In the so-called classical perspective, categories are crisp: objects either bear a label or they do not, and they either fit a schema or do not.⁸ For example, every natural number is either a prime or not, and no prime number is more prime than any other. But research in cognitive science over the past 35 years has found that people do not classify social and material objects in such a crisp fashion. Rather, they see some objects as having partial memberships, which means that concepts can have a typicality structure. For example, apples and oranges are judged as typical <fruit> but tomatoes and olives are atypical (Rosch and Lloyd 1978). In the case of markets, an independent business that serves espresso and sandwiches might be considered "sort of" or "technically" a <restaurant> but primarily a <coffee shop>. A software producer that develops spreadsheets might have partial membership in <database>.
The notion of a graded membership or typicality refers to the degree to which the focal agent regards a producer's feature values as fitting her schema for the label (or the similarity to the prototype). Producers with feature values close to the schema/prototype are said to be typical of that agent's concept.
In models of categorization, whether a person applies a label to an object is based on the distance between her mental representation of the object and her representation of the respective concept (e.g., schema or prototype). The higher an object's typicality, the higher the probability that the agent regards it as an instance of the associated concept (Rosch and Mervis 1975; Tversky and Gati 1978). For example, restaurants with similar menus get classified alike (Kovács and Johnson 2014). This means that the probability that an object gets assigned a label declines monotonically with its distance from the schema/prototype. We propose a simple functional form for this relationship.
Postulate 1. An object's typicality with respect to a concept is a negative logistic function of the distance of its feature profile from the agent's schema for the concept label: [...]
(In informal language, this postulate states that, normally, a producer's typicality to a schematized label l is a negative logistic function of the distance between the producer's feature profile and the audience member's schema for l. The formula specifies the existence of the positive constants included in the logistic function and that the formula holds for all labels and all audience members in the domain.)
Here and elsewhere, N denotes a nonmonotonic quantifier. Formulas quantified in this way provide formal representations of generic rules (or rules with possible exceptions). Such rules tell what is normally the case. The semantics of this quantifier are spelled out in Pólos and Hannan (2004). In addition, we use an informal sorting of variables where l refers to a label, x to an object, and y to an audience member.
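The displayed formula for Postulate 1 is not legible in this version, so the sketch below shows only one plausible form of a "negative logistic" relationship: a logistic curve that decreases in distance. The specific functional form and the constants `a` and `b` are assumptions made for illustration; the postulate itself requires only that positive constants of this kind exist.

```python
import math


def typicality(distance, a=4.0, b=2.0):
    """Illustrative typicality: a logistic function that declines with distance.

    `a` and `b` are hypothetical positive constants. `a` sets how typical an
    object at distance zero is, and `b` sets how fast typicality falls off.
    """
    return 1.0 / (1.0 + math.exp(b * distance - a))


# Typicality at a few distances from the schema.
profile = [typicality(d) for d in (0.0, 1.0, 2.0, 3.0, 4.0)]
```

Under these assumed constants, an object at distance 2 sits at the midpoint of the curve, and typicality stays strictly between 0 and 1 at every distance.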

Statics
Postulate 1 has implications for labeling. First we develop the standard view, which we will call the static perspective. In this view it is assumed that both the category schema and the characteristics of objects do not change. The main question is how atypical objects are assigned membership in categories that have fuzzy boundaries.
Fuzziness in category boundaries arises because agents tend to form different extensions over occasions. Sometimes they will include an object in a category's extension and sometimes not (for objects with moderate typicality). Recognition of this tendency has led to the use of a probabilistic notion of categorization. Theory and research emphasize the probability that an object sits in the extension of a concept (over occasions). Let $\pi(l, x, y)$ denote the probability that the agent y currently will apply the label l to the object x. We propose that this probability is proportional to x's typicality for the concept.
Postulate 2. The probability that an agent categorizes an object as a member of a category is proportional to its typicality with respect to the concept: [...]
This small argument implies that an object's feature-based distance from the relevant schema affects how it gets labeled.¹⁰
Lemma 1. [...]
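The static argument can be illustrated with a short simulation: labeling probability is taken to be proportional to typicality, and typicality declines with distance, so the share of occasions on which an object is labeled falls with its distance from the schema. The logistic form, the constants `a` and `b`, the proportionality constant `c`, and the function names are all hypothetical choices made for this sketch.

```python
import math
import random


def typicality(distance, a=4.0, b=2.0):
    """Illustrative decreasing logistic; a and b are hypothetical constants."""
    return 1.0 / (1.0 + math.exp(b * distance - a))


def label_probability(distance, c=0.9):
    """pi = c * typicality, with c in (0, 1] an assumed proportionality constant."""
    return c * typicality(distance)


def share_labeled(distance, occasions=10_000, seed=0):
    """Share of simulated occasions on which the agent applies the label."""
    rng = random.Random(seed)
    p = label_probability(distance)
    return sum(rng.random() < p for _ in range(occasions)) / occasions
```

Objects with moderate typicality are labeled on some occasions and not on others, which is the sense in which category boundaries are fuzzy.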

Dynamics
Now we turn to dynamics. Here we allow that both category schemas and object characteristics can change. To model this, we need to shift from considering static probabilities of categorization to the hazards of adding and dropping labels. We turn to the question of how an object's typicality, based on its feature profile, and the changing properties of the categories affect the dynamics of labeling, the hazards of adding and dropping labels. We start by stating a straightforward extension of the typicality argument.
Postulate 3. An agent's hazard of newly applying a label to a producer normally increases with the typicality of the producer's profile of feature values with respect to the schema for the label; and her hazard of dropping the label decreases with this typicality: [...]
Then it follows that the hazards are determined by the distance between a feature-value profile and the applicable schema.
Proposition 1. The hazard of an agent's newly applying a label to a producer presumably falls with the distance of the producer's profile of feature values from the agent's schema; and the hazard of the agent's dropping the label increases with this distance: [...]
Proof. The rule chain that connects the antecedent and consequent is a simple cut rule, which carries over from first-order logic to the nonmonotonic logic we use. The cut rule holds that φ → ψ and ψ → χ logically implies φ → χ. Postulates 1 and 2 imply that typicality declines with distance from the applicable schema. Our extension of the static typicality argument to dynamics (Postulate 3) states that higher typicality increases the hazard of adopting a label and decreases the hazard of dropping a label. This completes the chain that connects the antecedent and the consequent. The argument does not support a negative rule chain, a chain whose links run from the antecedent to the negation of the consequent. The absence of such a rule chain means that the implication stated in the proposition is proven.
¹⁰ The implications of a set of rules with exceptions are the logical consequences of a stage of a theory. Such provisional theorems have a haphazard existence: what can be derived at one stage might not be derivable in a later stage. So the status of a provisional theorem differs from that of a causal story. The syntax of the language codes this difference. It introduces a "presumably" quantifier, denoted by P. Sentences (formulas) quantified by P are provisional theorems at a stage of a theory (if they follow from the premises at that stage).
sociological science | www.sociologicalscience.com
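The chain of reasoning behind Proposition 1 (distance lowers typicality, and typicality raises the hazard of adding a label and lowers the hazard of dropping it) can be checked numerically under assumed functional forms. Everything specific below is hypothetical: the logistic typicality, its constants, and the linear hazard functions with their `base` rates. The proposition requires only the monotonic relationships.

```python
import math


def typicality(distance, a=4.0, b=2.0):
    """Illustrative decreasing logistic; a and b are hypothetical constants."""
    return 1.0 / (1.0 + math.exp(b * distance - a))


def add_hazard(distance, base=1.0):
    """Assumed hazard of newly applying the label: increases with typicality."""
    return base * typicality(distance)


def drop_hazard(distance, base=1.0):
    """Assumed hazard of dropping the label: decreases with typicality."""
    return base * (1.0 - typicality(distance))


distances = [0.0, 1.0, 2.0, 3.0, 4.0]
adds = [add_hazard(d) for d in distances]
drops = [drop_hazard(d) for d in distances]
```

Across the listed distances the adding hazard falls and the dropping hazard rises, which is the pattern the proposition asserts.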
For this proposition, we have not assumed that the schemas for labels remain fixed. Rather, an agent's schema is influenced by characteristics of the objects that currently belong to the category. This means that schemas change when members move within each space, and these changes influence subsequent categorization. Thus emerges a coupled ecology of market classification. Objects can independently change positions in feature space by changing their characteristics, or in label space by being reclassified, and this changes the link between feature space and label space. Altered category schemas then cause agents to change other categorizations, possibly leading to cascades of changes.

Leniency and Categorization
We might also expect that label-space properties affect labeling. In contexts that lack mechanisms for boundary enforcement, labels can become porous and have vague boundaries. This might modify the mapping between the two spaces.
Categories with vague, porous boundaries proliferate in the social world. Scholars who focus on the cognitive processes underlying categorization emphasize that categories organize knowledge and enable humans to structure their reality (Murphy 2002). But sociologists who study "on-the-ground" classification have noted that loose and overlapping categories appear in all types of classification systems, even those formally constructed by authorities, what DiMaggio (1997) called administrative classification. For example, the International Classification of Diseases (ICD) retained a number of ill-defined categories for causes of death, such as <eclampsia>, <convulsion>, or <hemorrhage>, to guard against either having too many unclassified cases or inflating figures of "badly defined diseases" (Bowker and Star 2000). From a perspective that stresses the benefits of concepts in allowing economic and rapid cognition, such vague concepts might seem to be pointless and counterproductive. However, loose concepts exist precisely because classification addresses a messy reality.12 Loose concepts and categories allow classification to incorporate entities that otherwise would be excluded, without jeopardizing the crispness of other concepts or categories.
We want to model how vague categories can affect the ecological dynamic introduced above. To do this, we need a formal definition of categories with vague, porous boundaries. We refer to categories that are close to many other categories as lenient.13 Lenient categories might not provide a basis for boundary enforcement. A key intuition is that lenient categories therefore provide less sharp meanings. We argue that this tendency arises from confusion caused by the similarity of lenient categories to others. For example, entrants into <disk array producer> were already identified with many other categories, including <storage subsystems>, <RAID>, and <network-attached storage>. As a result, the label took on diffuse and variegated meanings (McKendrick et al. 2003). In the emergence of the nanotechnology field, the label became quite loose due to overlaps with several scientific disciplines, including molecular physics, materials science, computer science, and electrical engineering (Grodal 2011). The possibility that a welter of categories applies to some members makes it hard for agents to interpret these memberships. We develop this notion in terms of proximities in label space.
Our argument focuses on the distance of a focal category from the other categories in the domain, which we call total distance.

Definition 3 (Total distance of a category from all others in the domain).
Lenient categories have low values of D. We do not suggest that any difference in leniency will be consequential. Rather, we propose that there is a threshold over which multiple overlaps make a category difficult for agents to interpret. The following meaning postulate formalizes this idea and provides an indirect definition of leniency.
Meaning postulate 1. If one category is much more lenient than another, then it normally is less distant from other categories in the domain:

The relationship just postulated can hold for many values of w. We identify (and name as $w^*$) the minimum value for which the relationship holds:

We suggest that leniency affects the link between feature-space positions and labeling. This is because categories do not exist in isolation. The vast research literature on concepts builds on the notion that categories play the cognitive role of grouping similar items and differentiating different items. Categorization concerns not only the match between an object and an agent's schema for one category but also the difference of the object from the agent's schemas for other categories. A metaphor can be seen in the concept of cue validity, which characterizes the diagnosticity of a feature value for membership in a particular category. Cue validity is based on both how much a feature is associated with a focal category and how little it is associated with others (Rosch and Lloyd 1978). Analogously, an object's fit to a schema is more diagnostic of membership if the schemas for other categories are not proximate.
Central to our argument is that having high proximity to many categories distorts perceptions of categorization based on feature values. To develop this formally, we first define, for producer-category pairs, the feature-space proximity between a focal producer and other producers assigned labels of all categories other than a focal category.
Definition 4 (Overall proximity to other-labeled objects at the producer level).
We then need to modify the standard assumption about how agents categorize (see Postulate 1). Specifically, we propose that high overall proximity weakens an object's fit to categorical schemas.15

Postulate 4. When the domain contains multiple labels, the typicality of a producer is a decreasing function of its proximity to the members of other categories in the domain and a decreasing function of its distance from the schema for the focal category:

where $N_{L,y}$ denotes the number of domain labels that the agent has schematized.
The typicality function in postulate 4 approaches the function given in postulate 1 as P ↓ 0. As P increases, typicality falls toward zero. This means that the higher an object's proximity to other categories, the weaker is the effect of its distance from the schema on typicality. Figure 3 illustrates the effect of increasing proximity in terms of this function. The upper (solid) curve plots $\mu = (1 + 0.02e^{2d})^{-1}$, and the lower (dashed) curve changes the value of the multiplier of the exponential term from 0.02 to 0.05, reflecting a much higher value of overall proximity. For the solid curve, where the producer has low proximity to members of other categories, an object at close distance to the schema results in a high expected typicality. For the dashed curve, where the producer has high proximity to members of other categories, an object at the same distance from the schema will have a lower expected typicality.
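The two curves described for Figure 3 can be evaluated numerically. The following is a minimal sketch that assumes only the illustrative functional form quoted in the text, $\mu = (1 + c\,e^{2d})^{-1}$, with the multiplier c set to 0.02 and 0.05; the function and constant names are hypothetical, and the sketch does not claim to reproduce the full typicality function of postulate 4.

```python
import math

# Multipliers of the exponential term quoted in the text for Figure 3.
LOW_PROXIMITY = 0.02   # solid curve: low proximity to other categories
HIGH_PROXIMITY = 0.05  # dashed curve: high proximity to other categories


def typicality(distance: float, multiplier: float) -> float:
    """Illustrative curve mu = (1 + c * exp(2 * d)) ** -1 (names are hypothetical)."""
    return 1.0 / (1.0 + multiplier * math.exp(2.0 * distance))


def compare_curves(distances):
    """Return (d, solid, dashed) triples for the two illustrative curves."""
    return [
        (d, typicality(d, LOW_PROXIMITY), typicality(d, HIGH_PROXIMITY))
        for d in distances
    ]


if __name__ == "__main__":
    for d, solid, dashed in compare_curves([0.0, 0.5, 1.0, 1.5, 2.0]):
        print(f"d={d:.1f}  solid={solid:.3f}  dashed={dashed:.3f}")
```

At every distance the dashed value lies below the solid one, which is the pattern the text attributes to higher overall proximity.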
While the co-presence of postulates 1 and 4 in the theory would create an inconsistency in first-order logic, this need not be so for nonmonotonic logic. We resolve this potential conflict by relying on the most characteristic feature of nonmonotonic logic: more specific rules override less specific ones. By restricting the postulate to cases in which a domain contains more than one label, we have made postulate 4 more specific than postulate 1. This more specific relationship overrules the less specific one in the analysis of multilabel domains.
Next we tie the argument about labeling to leniency. The first step is to define a producer's closest category. It simplifies the analysis greatly, without distorting the intuition, to assume that the closest category is unique for each producer.
Next we need to tie producer-level arguments to the category-level ones. We do so by considering pairs of producers that are alike in one important respect: the distance of their feature-value profile to the schema of their closest categories is the same. In other words, we consider pairs who fit the most applicable (best-fitting) schema to the same degree. But the pairs also differ in an important respect: their closest categories differ in total distance to other categories.
Postulate 5. Producers that are proximate to the schemas of categories that lie closer to other categories in the domain normally have higher total proximity to the members of all other categories in the domain (holding constant the degree of fit to the schema for the closest category):

Together, postulates 4 and 5 imply that producers that are close in feature space to a lenient label are seen as less typical of that label, as compared to producers that are equally proximate to a constraining one.
Lemma 2. If two producers are equally close to the schemas for their closest categories and the closest category for one is sufficiently more lenient than the closest category for the other, then the producer with the more lenient closest category presumably has lower typicality in its closest category than does the other producer:

Proof. The antecedent requires that the domain contains more than one label because it refers to a pair of labels that are applied to a pair of producers such that each bears one label but not the other. This means that specificity considerations dictate that the proximity-weighted typicality function (from postulate 4) applies. Given that the antecedent specifies that the producers' feature profiles stand at equal distance from the applicable schemas, any differences in typicalities depend only on the difference in total proximities for the producers. The antecedent also states that the difference in the leniencies of the applied categories exceeds $w^*$, which means that postulate 5 implies that the producer close to the lenient label has higher total proximity to the members of other categories. Inserting this inequality in proximities into the proximity-weighted typicality function completes the chain linking the antecedent and consequent. In the absence of any opposing rule chain given the premise set, this completes the proof.
It follows that the link between a producer's proximity to a label and categorization is stronger for constraining labels and weaker for lenient labels.
Proposition 2. Agents have higher hazards of adding and lower hazards of (newly) dropping the label of a producer's closest category when it is constraining as compared to when it is sufficiently more lenient:

Proof. The rule chain supporting lemma 2 entails a difference in typicalities for the pair of producers. A chain rule with the extension of the standard typicality argument to the dynamic situation, postulate 3, provides a positive rule chain linking the antecedent and consequent. The theory as stated does not give rise to any negative rule chains that imply a different conclusion (a different ordering of the hazards of dropping labels). Thus the nonmonotonic proof is complete.
The argument that supports proposition 2 holds that a category's structural position in label space affects how strongly it is linked to feature space. Specifically, the relationship between feature-space positions and labeling weakens when labels are lenient. This provides a theoretical basis for predicting loose coupling of feature values and label assignments.
Perhaps the most interesting implication of this argument concerns the trajectories of leniency and constraint in classification. Our argument suggests that lenient categories become more lenient: agents are less likely to take into account feature-based similarities when applying lenient labels, which can lead to a diverse set of agents being assigned a lenient label. This will increase the proximity of its members to the memberships of other categories, which will fuel the cycle. This pattern does not hold for members of much more constraining categories. The "Discussion" section sketches the argument (for a fixed set of categories) that lenient categories remain lenient and highly constraining categories remain constraining.16

A Context: Market Classification in the Software Industry
We investigate this model in the empirical setting of the software industry. In the software industry, an informal classification of product markets has emerged through interaction among producers, analysts, the media, and consumers. It depends heavily on producer affiliation with market categories. Consumers, employees, and investors use category labels to find producers, products, and services, as well as to stay abreast of innovations. Organizations identify with labels primarily to communicate what they offer to various audiences.
In making the shift to a specific context, we also shift from a single agent to a collective, an audience. This is because we relate labeling actions by a focal firm to similar decisions by others. These analyses depend on an assumption that the language is common within the audience; otherwise it would be feckless to search for the kinds of effects our theory yields. The degree to which this assumption characterizes audiences of interest is an interesting and important topic in its own right (Pontikes 2012; Koçak et al. 2014). But we do not address it here. What is important is that the audience investigated, software producers, uses a common system of classification to make sense of the domain.
We represent label space based on affiliations with market categories made by software producers in press releases. Press releases are an important medium for an organization to convey what it does. Software organizations use press releases to distribute news and create a public face for media analysts, investors, and consumers. The major stock exchanges in the United States suggest or require that companies issue press releases to distribute information. Media often note press releases in their coverage: a study of public companies finds an average of 1.5 (median 1.3) media articles per press release issued from 2001 to 2006 (Soltes 2009). Product classification is based on attributes of a company's product or service, so labels in this context are verifiable by consumers or analysts. As a result, it is important for organizations to describe their activities accurately. Because press releases can be produced at low cost, even small and young organizations (which are difficult to track otherwise) issue them. For these reasons, press releases provide a good source of data to construct label space in this domain.
Software producers use category labels to describe themselves in press releases. For example, MicroStrategy identified itself in 1995 as "a leading supplier of client/server decision support tools," thereby affiliating with two category labels. The <client/server> label indicates that MicroStrategy's software would run on multiple client machines that communicated through a server. The <decision support> label conveys that the software provides reports that combine data from a number of sources to identify and solve problems and support business decisions (Figure 4 lists other examples of label affiliations from press releases in these data). In this context, producers typically develop a suite of software products targeted at one or more market categories, and the category affiliation is at the level of the organization.
We investigated whether labels claimed in press releases were also used by other important audiences. We found that the preeminent industry-analyst organization, Gartner, issued reports on over 50 percent of labels extracted from press releases, showing that press releases capture a common language for classification among audiences in this domain. Of the labels used in both press releases and in Gartner reports, over 75 percent appeared first in press releases.
We also investigated whether affiliations in press releases target a specific audience or whether they reflect the general identity of the organization. This question is especially important because different audiences can have different preferences regarding classification (Pontikes 2012). We compared labels claimed in press releases with those claimed on company websites from the same time period (recorded in the archived web) and with those claimed in annual reports (10-K forms) for public companies. Organizations identified with the same labels in press releases and on websites 71 percent of the time and in 10-K forms 81 percent of the time. This indicates that label assignments from press releases reflect an organization's general market identity.
This classification contains both lenient and constraining categories. Constraining categories evoke specific expectations. For example, <entertainment software> refers to producers of video game software. This category has a trade association, the Entertainment Software Association (or ESA), that tracks membership and promotes the interests of members, such as lobbying the government on issues like video game ratings.

Lenient labels are vague and have porous boundaries, but they are not necessarily marginal or declining in importance. In many cases, they grow large and become prominent. For example, the lenient <customer relationship management> label received 149 mentions in the Wall Street Journal from its first appearance in 1998 through 2008. But <electronic design automation>, a prominent constraining label, was mentioned only 33 times from its first appearance in 1988 through 2002 (its last mention). Some of the most important categories in this classification are lenient.
Our model requires that label space only include labels at the same hierarchical level. Labels analyzed should not be supercategories or subcategories of one another. This condition holds for the labels included in this analysis. Lenient categories included in this analysis are not superordinate categories. In a hierarchy, a superordinate category sits above all subcategories in a tree, such that if an organization belongs to a subcategory, it also belongs to the supercategory. This type of hierarchical relationship is uncommon in our empirical context, as is often the case for informal classification. For example, the extension of <business intelligence> overlaps with that of <sales-force automation>, but not all <sales-force automation> members are labeled as <business intelligence>. Furthermore, some organizations only affiliate with lenient categories. Lenient categories overlap broadly, sometimes with more than 100 other categories; but the categories they overlap with are not nested.

Label Space
Testing these propositions requires data on label assignments. We use data compiled by Pontikes (2008) based on identity statements in press releases issued by software organizations (for examples, see Figure 4). This identifies times at which organizations adopt new labels and drop existing labels.
A compilation of press releases issued between 1990 and 2002 in Businesswire, PR Newswire, and Computerwire with at least three mentions of the word "software" served as the initial source of data: 268,963 press releases. A combination of custom-coded programs for text matching and visual examination of the outputs of these programs yielded records for 4,566 software organizations that issued press releases during this period. The identity statements made by these organizations indicate category membership. Pontikes (2008) assembled an extensive list of software labels from articles in Software Magazine and Computerworld and from inspection of the firms' identity statements, and used text-matching programs to search all identity statements for these labels. This created a file of organizations' labels for each year during 1990-2002. The final data contain information on 456 labels and 4,566 organizations over 18,192 organization-years.

Feature Space
Our empirical analysis positions organizations in a feature space based on technical capabilities, which we call knowledge space. This space is a subset of all feature values that characterize organizations. We use patents to represent knowledge space. Patents do not capture the complete set of relevant features in this domain, but patented technology reflects an important subset of innovative activity among software producers. Patents have been widely used in previous research to identify an organization's technological position and to measure relative similarities in a technical space (Jaffe 1986, 1989; Podolny et al. 1996; Stuart and Podolny 1996; Sørensen and Stuart 2000).
Technological innovation has been at the heart of the software industry's growth, and a fair amount of inventive activity can be traced through patents. Patents influence venture-capital financing, IPOs, and successful acquisition (Mann and Sager 2007; Cockburn and MacGarvie 2009). Even small organizations patent, hoping to exclude competitors (Mann 2005). Despite controversy about the relevance of patents in the software industry, a sizable number of software organizations received patents (20 percent in the press release data).17

17 Patents have been controversial in software. Gottschalk v. Benson in 1972 ruled that software programs were algorithms that could not be patented. However, this decision was mostly overturned in Diamond v. Diehr in 1981, which found that a software program could be patented if it was embedded within an apparatus. After a series of cases that increasingly supported the patentability of software, a 1995 ruling essentially reversed Gottschalk, and the last barrier to patenting pure software was overturned in 1998. Despite this, the approval of software patents was a routine practice long before the courts recognized it (Cohen and Lemley 2001; Cockburn and MacGarvie 2009; Mann 2005).

Patents provide a good basis to measure positions in a technological feature space, for several reasons. First, they reflect an organization's technical capabilities. Second, an independent party verifies them. Third, the patenting process is very weakly influenced (if at all) by an organization's product-market categorization. In this context, market categories are only loosely based on technologies. For example, <collaboration software> combines technologies based on video, GUI (graphical user interface), and mathematical algorithms. Feature-space data based on product evaluations by industry analysts would be much more likely to be influenced by how the organization positioned itself using product-market classification (not to mention that these data are not systematically available). A knowledge space based on patents captures a critical subset of feature space for this domain that can be measured independently from label space. Through citations, patents provide a historical record of knowledge-space positions and proximities.
The downside to using patents is that not all software organizations patent. We conduct our main analyses on the subset of organizations that are active in knowledge space (those that have previously patented). The press release data contain 789 organizations with at least one patent across 4,012 organization-years. We run additional analyses on all organizations in the press release data, coding those that are not active in knowledge space (those without patents) as having zero knowledge-space proximity to all labels. The results are consistent across both analyses.
Patent citations come from the National Bureau of Economic Research patent-data project (Hall et al. 2001). Data were updated to include patents through 2002.18 The U.S. Patent and Trademark Office issues patents for new, unique, and nonobvious inventions. All patents must cite the relevant "prior art" on which their invention builds, and the prior-art citations indicate the knowledge foundation for the patent at hand. Two patents that cite the same patent as prior art are more similar in knowledge space than pairs that lack common citations.
The patent office requires that inventors' claims be focused and narrow. Inventors must cite any relevant patents of which they are aware, and the patent examiner can also add citations to ensure comprehensive citation to prior art (Alcácer and Gittleman 2006). For some studies, it is important that the citations to prior art accurately reflect the inventor's knowledge; and citations added by the examiner are problematic. This does not pose a problem for our study, because we use patents to locate organizations in a knowledge space. Citations added by an examiner help refine the patents' positions.
Patent officers assign patents to a main class and one or more subclasses. There are about 400 classes. The National Bureau of Economic Research has created a higher-level classification system of six classes: Chemical, Computers and Communications, Drugs and Medical, Electrical and Electronics, Mechanical, and Other (Hall et al. 2001). This study uses all patents granted in the Computers and Communications class between the years 1990 and 2002 to construct knowledge space for the software industry. This broad class includes patents relevant to software.
Following Pontikes (2008), we construct knowledge space using a moving five-year window of all patents in the Computers and Communications class. The window covers patents that were applied for (and subsequently granted) in the current year and the four years prior, and it includes all patents relevant to software, whether issued to software organizations, individual inventors, universities, or nonsoftware organizations.
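The moving five-year window can be made concrete with a short sketch. It assumes that each granted patent is represented by a hypothetical (patent_id, application_year) pair; the function names and the data layout are illustrative and are not taken from the original data files.

```python
def in_window(application_year: int, current_year: int, window: int = 5) -> bool:
    """True if the application year is the current year or one of the (window - 1) prior years."""
    return current_year - (window - 1) <= application_year <= current_year


def patents_in_window(patents, current_year: int, window: int = 5):
    """Select patent ids whose application year falls inside the moving window.

    `patents` is an iterable of (patent_id, application_year) pairs for patents
    that were subsequently granted (a hypothetical layout for illustration).
    """
    return [pid for pid, year in patents if in_window(year, current_year, window)]


if __name__ == "__main__":
    sample = [("a", 1994), ("b", 1995), ("c", 1999), ("d", 2000)]
    # For 1999 the window spans application years 1995 through 1999.
    print(patents_in_window(sample, 1999))
```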

Independent Variables
Here we provide the details about the measurement of the two key explanatory variables.
Leniency. To represent leniency, we need to measure the proximities among categories in the domain. We adopt Pontikes's (2008) measure, which uses category overlaps in label space to indicate similarity between categories. A category is more similar to another in label space if many of its members also belong to the other category.
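The idea that shared members indicate similarity between categories can be sketched in a few lines. The example below assumes a Jaccard-style ratio (shared members divided by all members of either category); this is an illustrative stand-in chosen for simplicity, not necessarily the measure used by Pontikes (2008), and the organization identifiers are hypothetical.

```python
def overlap_similarity(members_a: set, members_b: set) -> float:
    """Jaccard-style overlap between two categories' member sets.

    Assumption for illustration only: similarity is the number of shared
    members divided by the number of members of either category.
    """
    union = members_a | members_b
    if not union:
        return 0.0
    return len(members_a & members_b) / len(union)


if __name__ == "__main__":
    # Hypothetical memberships for three labels in one time interval.
    first = {"org1", "org2", "org3"}
    second = {"org2", "org3", "org4"}
    third = {"org5"}
    print(overlap_similarity(first, second))  # two shared members out of four
    print(overlap_similarity(first, third))   # no shared members
```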
Leniency is the product of inverse contrast and a positive function of the number of categories with which the focal category overlaps. Contrast is the average label-based typicality of a category's members. It measures the extent to which a label comprises typical or atypical members (Hannan et al. 2007).19 To measure contrast, we first must compute label-based typicalities, ϑ(l, x, y), from label assignments. We assume that an object that is assigned to multiple categories is less typical of each and that this diminution is exacerbated when the categories lie far apart in the label space. For example, an establishment that is both a French restaurant and a car wash would be very atypical of each label.
In accordance with our theory, we take into account producers' partial affiliations with labels. What we call label-profile typicality, denoted by ϑ(l, x, y), is based on (1) the number of times a producer affiliates with a focal category in a given time period, compared to the number of times it affiliates with other categories, and (2) whether a producer is a member of multiple similar or multiple distant categories. The more a producer affiliates with a focal category, the higher its label-profile typicality. The less it affiliates with other categories, especially dissimilar ones, the higher is its typicality in the focal category. Label-profile typicality ranges over [0,1]. This construction follows Kovács and Hannan (2011) but uses a generalization that builds on Pontikes (2008). The first step calculates the similarities of all pairs of categories, in notation $\mathrm{sim}_L(l, l', u)$, based on how many organizations are assigned membership in both categories in the time interval u. The second step uses a Shepard (1987) construction to calculate distance between a pair of categories based on label profiles, in notation $d_L(l, l', u)$, as a Euclidean distance, as follows:

We use measured similarities (based on co-occurrence) and a setting for the parameter h to obtain $d_L$. After experimentation with values for h ranging over the integers from 1 to 5, we found similar patterns of effects of theoretically relevant variables (significance levels do vary, however). We set h = 3, which provides the best overall model fit. The next step calculates typicality, ϑ(l, x, u), from label profiles using the idea that a producer's typicality in applied categories declines with the application of each new label, and the decline is greater when the newly added label lies far from the already applied categories.

label with a hundred typical members and one member with broad overlap would appear as lenient.

We calculate label profiles by taking into account that firms often release multiple press releases at multiple times during a year, and each provides information on labeling. Rather than only including whether a firm claims a label in a given time period, we weight the relative frequency of claims to each label. For example, if an organization claims label A in one press release and B in three press releases in the same year, for a total number of 4 label mentions, then we calculate l(A, x) = 1/4 and l(B, x) = 3/4. We calculate ϑ as follows:
A label's contrast is the average label-based typicality of its members:21

We calculate leniency as the product of inverse contrast and a positive function of the number of categories overlapped:

where $n_{l,u}$ denotes the number of other categories whose memberships overlap with l. The function $\psi(n_{l,u})$ is computed based on a summation of the distance between l and each label l' with which it overlaps. We use a log transform because the interpretability of a label ought to drop more sharply when the number of overlaps rises in the lower range than at higher levels. Our decision to weight the number of categories overlapped by their distances from the focal label follows the spirit of the construction of ϑ, namely, that combining dissimilar categories causes a greater loss in distinctiveness than combining similar ones.
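The frequency weighting of label claims described earlier (l(A, x) = 1/4 and l(B, x) = 3/4 for one mention of A and three of B) can be sketched as follows. The function name and the input format, a flat list with one entry per label mention in a year, are assumptions made for illustration; the sketch covers only the weighting step, not the typicality, contrast, or leniency formulas.

```python
from collections import Counter
from fractions import Fraction


def label_weights(mentions):
    """Relative frequency of each label among a producer's label mentions in a period.

    `mentions` holds one entry per label mention across that period's press
    releases (a hypothetical input format for this sketch).
    """
    counts = Counter(mentions)
    total = sum(counts.values())
    return {label: Fraction(n, total) for label, n in counts.items()}


if __name__ == "__main__":
    # One press release claims label A; three press releases claim label B.
    weights = label_weights(["A", "B", "B", "B"])
    print(weights["A"], weights["B"])
```

Exact fractions are used here so that the weights always sum to one without rounding error; floating-point division would serve equally well in practice.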
We separate leniency into two pieces, instead of using the continuous measure, because the key propositions depend on one label being sufficiently more lenient than another. We break the distribution of leniency at 2.7, the approximate mid-point of the leniency of producers' labels, and treat high (low) leniency as values above (below) the breakpoint.

Producer's proximity to labels in knowledge space. The other main variable to be defined is the proximity (inverse distance) of a producer's position in knowledge space from a label. Recall that we assume that the hazards of adding and dropping labels are negative exponential functions of the distance of an object's position in feature space from the schema for the label (Postulate 4). We lack knowledge of schemas. So we proceed indirectly by measuring a label's position in feature space in terms of the set of positions of the objects that lie in the label's extension. Then we measure an object's position in feature space relative to the label's extension.
We measure producers' positions in knowledge space using their patents' prior-art citations. Patents that share citations build on similar knowledge. Therefore we use citation overlaps to position patents relative to one another in knowledge space (Podolny et al. 1996). We use the patent citation network of all patents in the Computers and Communications class (Hall et al. 2001). We measure similarity from prior-art citations (Podolny et al. 1996; Podolny and Stuart 1995). The similarity of patent i to patent j, $\alpha_{ij}$, is defined as $s_{ij}/s_i$, where $s_{ij}$ denotes the number of shared citations between patents i and j, and $s_i$ denotes the total number of citations by patent i.22 We can capture more fine-grained detail by also considering second-degree similarity, which takes into account whether a patent stands within two degrees of separation from another patent. This brings into the picture common dependence on relevant patents that were issued to nonsoftware organizations such as universities. We compute the second-degree similarity of patent i to patent k by multiplying their similarities to all third patents, j, and choosing the j that maximizes the similarity of i and k: $\rho_{ik} = \max_{j \mid j \neq i,\, j \neq k} (\alpha_{ij} \cdot \alpha_{jk})$.
22 We have the choice of whether to use asymmetric or symmetric measures of similarity. With asymmetric similarities, patent i can be more similar to patent j than j is to i. Some research in cognitive psychology supports the use of asymmetric similarities. Studies show that people's assessments of similarity are asymmetric across a number of domains (countries, figures, letters, signals) (Tversky 1977). Competition can also be asymmetric: for example, a specialist competes more strongly with a generalist than vice versa. For this reason, asymmetric measures have previously been used to define technological niches among firms using patent citation data (Stuart and Podolny 1996). There are also reasons to use symmetric similarities (defined as the number of overlapping citations divided by the number of citations made by both i and j: $\alpha_{ij} = s_{ij}/s_i s_j$). Symmetric similarities translate into symmetric distances in feature space, which means that distances between two points in feature space will not depend on directionality. This is an important property for our formal model of feature space and label space. The choice does not materially affect our results. Weighing these options, we chose to use asymmetric patent similarities in our primary analysis to comport with standards of existing empirical research.
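The two citation-based measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patent identifiers, the toy citation sets, and the function names `similarity` and `second_degree` are hypothetical, and a patent with no citations is assumed here to have similarity zero to everything.

```python
from typing import Dict, Set

def similarity(cites: Dict[str, Set[str]], i: str, j: str) -> float:
    """Asymmetric similarity alpha_ij = s_ij / s_i (shared citations over i's citations)."""
    s_i = len(cites[i])
    if s_i == 0:
        return 0.0  # assumption: no citations means no measurable similarity
    return len(cites[i] & cites[j]) / s_i

def second_degree(cites: Dict[str, Set[str]], i: str, k: str) -> float:
    """rho_ik = max over third patents j (j != i, j != k) of alpha_ij * alpha_jk."""
    thirds = [j for j in cites if j != i and j != k]
    return max(
        (similarity(cites, i, j) * similarity(cites, j, k) for j in thirds),
        default=0.0,
    )

# Toy citation sets (hypothetical data): each patent maps to the prior art it cites.
cites = {
    "A": {"p1", "p2", "p3", "p4"},
    "B": {"p1", "p2"},
    "C": {"p2", "p9"},
}

print(similarity(cites, "A", "B"))     # A shares 2 of its 4 citations with B
print(similarity(cites, "B", "A"))     # B shares 2 of its 2 citations with A
print(second_degree(cites, "A", "C"))  # best path from A to C runs through B
```

The toy data show the asymmetry discussed in the footnote: B is more similar to A than A is to B, because B cites fewer patents in total.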
The degree to which interested agents regard a patent as tied to a label presumably depends on how strongly the organization to which the patent was issued is affiliated with the label. If the inventor firm has a low level of typicality in a label, its patents should similarly be considered only partially representative of the label. To reflect this view, we define a patent's typicality in a label as equal to the label-based typicality in l of the firm to which the patent was issued, in notation $\psi(l, i, u) = \vartheta(l, x, u)\, I(i, x, u)$, where $I(i, x, u)$ is an indicator variable equal to one if patent i is a member of the set of organization x's patents (in notation $p_{xu}$) and equal to zero otherwise. The proximity (inverse distance) of an organization's position in feature space to a label, $p(l, x, u)$, is given by Equation 5, where the second summation is restricted to the patents affiliated with the label l. We also use Equation 5 to calculate the proximity (inverse distance) of a producer's position to all labels that it currently claims, $p_{C}(x, u)$, and to all labels it currently does not claim, $p_{NC}(x, u)$ (to be used as control variables). Because of skew in the distribution of the distances, we use (natural) logarithmic transformations of these variables in our empirical analysis.23
A producer's distance from a label's feature profile can change in three ways: (1) it can move away from a stationary profile, (2) other producers can move while the focal producer remains in place, or (3) both the focal producer and the other profile members move but in different directions. To explore this in supplementary analyses, we also create a measure of an organization's stationarity in knowledge space based on citation overlap between the organization's patents in the current year and its patents in the previous four years.

Controls
Social factors, such as whether a category is resource-rich or otherwise popular, can also affect label adoption.People and organizations sometimes engage in herding behavior, where high levels of adoption in one period lead to increased rates of adoption in the next period (Strang and Soule 1998).This effect can result either because agents follow each other's actions, as in the case of fads (Lieberson 2000;Meyersohn and Katz 1957), or because the number of organizations in a category signals an unmeasured factor, for example demand for a specific product.To account for these effects, we include the label's fuzzy density (the number of organizations that are members of the label weighted by typicality) to control for the label's size in the previous period.The rate at which a label is catching on can also affect adoption patterns (Berger and Le Mens 2009).Therefore we also control for the number of new affiliations with the label and number of drops in the previous period, both weighted by typicality.We also include controls for the leniency of the label and tenure of the label measured since the beginning of our records, 1990.
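As a small illustration of the typicality weighting used in these controls, the sketch below contrasts a fuzzy density with a simple head count. The organization names and grades of membership are hypothetical, and the function names are introduced here only for illustration.

```python
from typing import Dict

def fuzzy_density(typicality: Dict[str, float]) -> float:
    """Number of members of a label weighted by typicality: the sum of the grades."""
    return sum(g for g in typicality.values() if g > 0)

def crisp_density(typicality: Dict[str, float]) -> int:
    """Unweighted count of organizations with nonzero membership, for comparison."""
    return sum(1 for g in typicality.values() if g > 0)

# Hypothetical typicalities of four organizations in one label in one period.
members = {"org_a": 1.0, "org_b": 0.5, "org_c": 0.25, "org_d": 0.0}

print(fuzzy_density(members))  # weighted size of the label
print(crisp_density(members))  # head count of members
```

The same weighting would apply to the counts of recent adds and drops: each event contributes the typicality of the organization involved instead of one.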
Organization-level controls include the number of labels in which the organization has nonzero membership, the number of other labels in the industry, the count of its patents (to account for general inventive activity), and the time since the organization has last added or dropped any label. Venture capitalist investors may influence an organization to add or drop market categories (Pontikes 2012). Therefore we also control for whether the organization has recently received venture capital financing. Because old and large organizations might not pay as much attention to feature-based fit, perhaps because they receive less scrutiny, we control for the organization's tenure in our data, to account for age, and for whether it was ranked in Software Magazine's top 500 software companies (based on revenue) to account for size. To test whether there is a systematic relationship with knowledge-space proximity to a label, we also estimated specifications that include interactions between these variables and proximity.24
Tables 1 and 2 provide descriptive statistics for the independent variables in our analysis.

Dependent Variables
We analyze events of adding and dropping labels in continuous time. We code adding a label as occurring in the first year in which an organization lists that label in press releases, and the dropping of a label as occurring in the last year an organization lists it (if the observation is not right-censored at that time).25 Seventy percent of organizations add and drop at least one label during the time period studied. Ninety-five percent of patenting organizations add and drop at least one label. Twenty percent of all organizations and 45 percent of patenting organizations add more than five labels. Twenty percent of all organizations and 40 percent of patenting organizations drop more than five labels.
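The coding rule for add and drop events can be sketched as follows. This is a minimal illustration, not the coding procedure actually used: the function name `code_events` is hypothetical, a spell is assumed to be right-censored when the label is still listed in the final observed year, and gaps between listing years are ignored.

```python
from typing import Iterable, Optional, Tuple

def code_events(years_listed: Iterable[int],
                last_observed_year: int) -> Tuple[int, Optional[int]]:
    """Return (add_year, drop_year) for one organization-label dyad.

    add_year  : first year the organization lists the label.
    drop_year : last year it lists the label, or None if the spell is
                right-censored (assumed: still listed in the final observed year).
    """
    years = sorted(set(years_listed))
    if not years:
        raise ValueError("the organization never lists this label")
    add_year, last_listed = years[0], years[-1]
    drop_year = last_listed if last_listed < last_observed_year else None
    return add_year, drop_year

print(code_events([1993, 1994, 1996], last_observed_year=2001))  # add and drop both observed
print(code_events([1999, 2001], last_observed_year=2001))        # right-censored, no drop coded
```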

Stochastic Specifications and Estimation
We test these hypotheses using event-history analysis. We estimate the effects of proximity and leniency on the hazards with standard piecewise-constant specifications. We update covariates for each time piece. In both forms of analysis, pieces are defined for less than one year, [1-2) years, [2-4) years, and 4 years or greater. Standard errors are clustered by label.
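The duration pieces can be made concrete with a short episode-splitting sketch. It is an illustration only: it omits covariates and clustering, the names `split_spell` and `piece_rates` are hypothetical, and the toy spells are invented. Without covariates, the maximum-likelihood estimate of a piecewise-constant hazard in each piece is the number of events divided by the exposure time in that piece.

```python
import math
from typing import List, Tuple

# Piece boundaries in years: [0, 1), [1, 2), [2, 4), and 4 or greater.
CUTS = [0.0, 1.0, 2.0, 4.0, math.inf]

def split_spell(duration: float, event: bool) -> List[Tuple[int, float, float, bool]]:
    """Split one spell into (piece, start, end, event) sub-episodes."""
    episodes = []
    for piece, (lo, hi) in enumerate(zip(CUTS[:-1], CUTS[1:])):
        if duration <= lo:
            break
        end = min(duration, hi)
        # The event, if any, is attached to the sub-episode in which the spell ends.
        episodes.append((piece, lo, end, event and end == duration))
    return episodes

def piece_rates(spells: List[Tuple[float, bool]]) -> List[float]:
    """Events divided by exposure within each piece (no covariates)."""
    exposure = [0.0] * (len(CUTS) - 1)
    events = [0] * (len(CUTS) - 1)
    for duration, event in spells:
        for piece, start, end, ev in split_spell(duration, event):
            exposure[piece] += end - start
            events[piece] += int(ev)
    return [e / t if t > 0 else float("nan") for e, t in zip(events, exposure)]

# Hypothetical spells: (duration in years, whether the spell ended in an event).
spells = [(3.0, True), (0.5, True), (5.0, False)]
print(split_spell(3.0, True))
print(piece_rates(spells))
```

Splitting spells this way is also what allows covariates to be updated at the start of each piece.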
Adding labels. In analyses of the hazard of adding a label, the risk set consists of all dyads of an organization that has previously patented and one of the labels with which it does not affiliate. There are 337,800 organization-label dyads over 1,434,096 organization-label-years, and 5,266 events of adding a label. We also conduct additional analyses using data including all organizations in the press release data, whether they patent or not. In this case, we assign nonpatenters knowledge-proximity scores of zero for all labels. This larger data set contains 1,893,569 organization-label dyads over 6,485,521 organization-label-years and 15,103 events of adding a label. Dyads enter the model in the first year the organization or label enters the data (whichever comes later). Duration is the time elapsed since the organization-label dyad entered these data.
Dropping labels. In analyses of the hazard of dropping a label, the risk set contains all dyads of an organization that has previously patented and its labels: 5,967 dyads over 10,908 organization-label-years, with 4,397 events of dropping a label. Again we also included the nonpatenters in additional analyses; the entire data set contains 21,589 organization-label dyads over 39,359 organization-label-years, with 13,880 events of dropping a label. We analyze spells during the years 1990-2001. Dyads enter the risk set in the first year the organization affiliates with the label. Duration is the time elapsed since the organization first affiliates with the label.

Hypotheses
We test four hypotheses in this empirical context based on our theoretical propositions. The first and second hypotheses concern the relationship between feature-space positions and label affiliations (proposition 1).
Hypothesis 1. A producer's hazard of adding a label increases with its proximity in knowledge space to the cluster of producers associated with the label.
Hypothesis 2. A producer's hazard of dropping a label decreases with its proximity in knowledge space to the cluster of producers associated with the label.
The third and fourth hypotheses concern the effects of leniency (proposition 2).
Hypothesis 3. Proximity in knowledge space to the producers associated with a constraining label has a stronger positive effect on the hazard of adding that label than does proximity to a more lenient label.
Hypothesis 4. Proximity in knowledge space to the producers associated with a constraining label has a stronger negative effect on the hazard of dropping that label than does proximity to a more lenient label.

Adding Labels
We analyze the effects of knowledge space proximity on the hazards of adding a label.Table 3 reports estimations on the hazard of adding a label for the independent variables in the analysis (all estimations include controls; full models available upon request).The first model in Table 3 reports tests of H1.This analysis considers dyads of organizations that have patented and the labels that they do not already claim.The results support H1: an organization's knowledgespace proximity to a label has a positive effect on the hazard of it adopting this label (p < 0.01 in a z-score test).An organization that is two standard deviations above the mean on proximity to a label is 10 percent more likely to adopt the label, as compared to one at mean proximity, according to this estimate.
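The reported magnitude can be translated into an implied coefficient with one worked step. This is a sketch that assumes (logged) proximity enters the hazard log-linearly, the usual form for piecewise-constant exponential models; the symbols $\beta$ (the coefficient on proximity) and $\sigma$ (the standard deviation of proximity) are introduced here for illustration and are not values reported in the text.

```latex
\frac{r(\bar{x} + 2\sigma)}{r(\bar{x})} = \exp(2\sigma\beta) = 1.10
\quad\Longrightarrow\quad
\sigma\beta = \tfrac{1}{2}\ln(1.10) \approx 0.048,
\qquad
\exp(\sigma\beta) = \sqrt{1.10} \approx 1.049 .
```

Under that assumption, a one-standard-deviation increase in proximity corresponds to a hazard of adding the label that is roughly 5 percent higher.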
Next we analyze whether the effects of proximity depend on leniency.The second model of Table 3 provides tests of H3.The results support this hypothesis: proximity to a constraining label has a strong positive effect on the hazard of adding it (z-score test significant at p < 0.01).However, the effect of proximity is smaller and insignificant when the proximate label is lenient.This specification fits better than the one in model 1 (the likelihood-ratio test is significant at p < 0.01).This indicates that feature values matter less for affiliation with lenient labels.
We also explored some possible confounding effects. One question might be whether proximity drives the effect, or whether it just reflects the consequences of general exploration in the space. To test against this alternative, we include an effect of the organization's proximities to all other labels it does not affiliate with, and to all other labels it does affiliate with (model 3). The results show that general exploration in knowledge space has a positive effect on an organization's propensity to add any label. But the effect of exploration is smaller by an order of magnitude as compared to the effect of proximity to the focal label. An organization's proximity to labels it already claims does not have a statistically significant effect on adding a new label. Importantly, our main results persist. Estimations run on all organizations in the press release data, where nonpatenters are assigned knowledge proximity zero, give similar results (model 4). Results are also similar when we control for leniency using a binary variable that indicates whether the organization affiliates with a high-leniency label, defined using the same cutoff.

Dropping Labels
The next set of analyses explores the effects of knowledge-space proximity on the hazard of dropping a label. Table 4 reports estimations on the hazard of dropping a label for the independent variables in the analysis (all estimations include controls; full models available upon request). The first model in Table 4 provides a test of H2. This analysis considers dyads of organizations and labels that they affiliate with. The results support H2: the higher an organization's proximity to a label in knowledge space, the less likely it is to drop its affiliation with that label (p < 0.05 in a z-score test). This specification also includes an effect of the organization's knowledge-space proximity to all other unclaimed labels. This effect is positive, and in some estimations (Table 4, models 4 and 5) it is also significant. Together, these results illustrate the dynamic link between feature-space location and label assignments: the closer an organization is to a label it already claims, the less likely it is to drop this label. The closer it is to other labels it does not claim, the more likely it is to drop a claimed label (and adopt the new label to which it has become proximate).
The second model in Table 4 provides tests of H4, concerning the effects of leniency on hazards of dropping a label. The coefficients have the predicted signs but they do not differ significantly according to a chi-square test. Consistent with the hypothesis, proximity to a constraining label has a negative and significant effect on the hazard of dropping (z-score test significant at p < 0.01), whereas the effect for lenient labels is weaker and not significant. To explore this effect further, we increased the leniency threshold from 2.7 to 3.0, and the results strongly support the hypothesis (model 3). This effect persists when we include a control for an organization's proximity to the other labels it already claims (model 4). An estimation on all organizations in the press-release data (model 5) also shows similar effects. In addition, consistent results hold when we control for leniency using a binary variable. This analysis highlights the importance of the existential quantification in the definition of leniency: the predictions about leniency hold only provided that one category is sufficiently more lenient than another. In this analysis, these effects are evident when the leniency threshold is increased, suggesting that there exists a leniency threshold above which the link between the spaces breaks down. Overall, this analysis provides some support for H4. Together with the results reported here, it provides additional evidence that leniency weakens the relationship between feature values and label affiliations. It also suggests that feature-space proximity has a stronger influence on the propensity to drop a lenient label as compared to the propensity to adopt a lenient label.
sociological science | www.sociologicalscience.com
Note to Table 3: All specifications include: label controls for leniency, fuzzy density, recent adds, recent drops, and label tenure (since 1990); organizational controls for the number of labels the organization claims, the number of other labels in the industry, a count of the organization's patents, tenure (since 1990), whether it ranked in the Software 500, recently received venture capital financing, and the time since adding any label; year fixed effects and duration pieces. Patenters: 337,800 org-label dyads over 1,434,096 org-label-years; 5,266 events of adding a label. All orgs: 1,893,569 org-label dyads over 6,485,521 org-label-years; 15,103 events of adding a label. * p < 0.05; † p < 0.01
Proximity to other labels the organization affiliates with has a negative effect on the hazard of dropping the focal label, and it is significant in analyses of all organizations (z-score test significant at p < 0.01).Again, the size of the effect is substantially smaller than the effect of knowledgespace proximity to the focal label (from the dyad).This indicates that there might be some symbiotic effects in terms of the feature profiles of labels a producer simultaneously claims.
One potential concern is that organizations might generally be more proximate to lenient labels and that this drives the results in the estimates for both adding and dropping labels. Our data show that the distributions of knowledge-space proximities to constraining and lenient labels are similar.26
Together these results for dropping labels and those reported for adding labels show how feature-based proximities shape categorical affiliations in a dynamic context. These effects are illustrated in Figures 5 and 6. For constraining labels, there is a strong relationship between an organization's proximity to the label and an increased hazard of adding the label, and decreased hazard of dropping the label. For lenient labels, the link is weaker.
Note to Table 4: All specifications include: label controls for leniency, fuzzy density, recent adds, recent drops, and label tenure (since 1990); organizational controls for the number of labels the organization claims, the number of other labels in the industry, a count of the organization's patents, tenure (since 1990), whether it ranked in the Software 500, recently received venture capital financing, and the time since dropping any label; year fixed effects and duration pieces. Model 3 imposes a higher leniency threshold (see text). Patenters: 5,967 org-label dyads over 10,908 org-label-years; 4,397 events of dropping a label. All orgs: 21,589 org-label dyads over 39,359 org-label-years; 13,880 events of dropping a label. * p < 0.05; † p < 0.01

Additional Tests
We conduct additional tests of these results. Our model proposes a coupled ecology between feature space and label space, where both positions of organizations and meanings of categories shift over time. Results show that the relative distance in feature space between an organization and category representation affects whether an organization adds or drops the category label. We further explore this by including in the estimations a measure of organization stationarity, which measures the percentage of the organization's current year citations that the organization has previously cited (in the past four years). Organizations that score low on this measure have moved more in knowledge space, whereas those that score high are more stationary. We include stationarity and an interaction between stationarity and knowledge proximity to the focal label as covariates. Results show that neither stationarity nor the interaction has a significant effect on an organization's propensity to add or drop labels. Effects reported here persist (results available upon request). This suggests that the reported effects do not simply reflect an organization's movement in feature space toward a stationary category.
26 See Tables 1 and 2.
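The stationarity measure described above can be sketched as follows. The sketch assumes that citations are compared as sets of distinct cited patents and that an organization with no citations in the current year has no defined score; the function name, the toy data, and the `window` parameter are hypothetical.

```python
from typing import Dict, Optional, Set

def stationarity(citations_by_year: Dict[int, Set[str]],
                 year: int, window: int = 4) -> Optional[float]:
    """Share of the current year's citations that the organization also
    cited in the previous `window` years."""
    current = citations_by_year.get(year, set())
    if not current:
        return None  # assumption: undefined when nothing is cited this year
    previous: Set[str] = set()
    for y in range(year - window, year):
        previous |= citations_by_year.get(y, set())
    return len(current & previous) / len(current)

# Hypothetical citation history of one organization.
history = {
    1995: {"a", "b"},
    1997: {"c"},
    1998: {"a", "c", "d", "e"},
}

print(stationarity(history, 1998))  # 2 of the 4 citations in 1998 were cited in 1994-1997
```

A score near one indicates an organization that keeps building on the same prior art; a score near zero indicates one that has moved in knowledge space.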
We also conduct additional tests to further control for social factors that might affect an organization's propensity to add or drop labels.We control for the number of organizations in a label that have recently received venture-capital financing.Such investments signal that an area is promising and might give managers hope that additional funds will follow.The hypothesized effects are robust to the inclusion of this control.

Constraint or Accident?
We noted that an observed pattern of label overlap can result from accidental circumstances or from constraint.We suggest that these patterns are not accidental and that they reflect differences in constraint among labels.But if constraint is the source of variation in the strength of categorical boundaries, where does this constraint come from?Here we sketch an extension of the theory that links category overlap to appeal for producers that change positions in label space.
Suppose that an audience member lacks detailed information about the feature values of two producers but does observe label assignments.27 Consider two producers that belong to different categories, one constraining and one much more lenient, with equal (high) typicality in their assigned category. Suppose that each is assigned to some third label. Recall that our argument implies that constraining categories normally lie further from other categories than do lenient ones (if the difference in leniency exceeds a critical value). Then the definition of the typicality in a label ($\vartheta$) implies that the member of a constraining category generally experiences a greater decline in its typicality in its initial category than is the case for the member of a lenient one. In other words, taking on new label assignments generally reduces typicality more for members of constraining categories. This difference matters when the categories involved have positive valuation. This means the intrinsic28 appeal of a producer to a typical audience member is a positive increasing function of its typicality in the meaning of the relevant label (Hannan et al. 2007).29 So long as the concepts are positively valued, producers affiliated with constraining labels lose more intrinsic appeal when they span concepts than is the case for those affiliated with more lenient ones.
Proposition 3. Suppose that one producer belongs to one but not the other category and a second producer has the mirror image pattern of memberships and the typicalities are equal. If each producer adds the same third label, then the member of the constraining category ends up with lower intrinsic appeal in that label as compared to the member of the more lenient category.
28 "Intrinsic" in this context refers to degree of fit to the agent's aesthetics. The theory on which we build holds that it takes engagement of the audience by the producer to convert intrinsic appeal to actual appeal.
29 Producers can have high (or low) typicality in constraining or lenient categories. The fact that appeal increases with a producer's typicality in a category need not imply that audiences prefer constraining categories. Previous research shows that venture capitalists, an important audience in the domain of our empirical study, prefer producers in vaguely bounded categories (Pontikes 2012).
In this sense, members of constraining categories have more to lose when they adopt feature values that cause them to get assigned some additional category membership(s).This cost is the source of the constraint.

Dynamics of Leniency
We proposed that proposition 2 implies that lenient labels tend to become more lenient and constraining labels get more constraining.Here we describe how to derive these implications.We use an illustration of a domain with one constraining label and three lenient ones in Figure 7. (For simplicity of representation we use a Euclidean space in this illustration.)In each case, the shaded circle represents a schema (the set of feature-value profiles that lie in these circles are the schemas).The outer circles indicate the locations of the producers assigned the category label.
Category A is constraining because its members more closely fit the schema (the outer circle is closer to the shaded one than is the case for the other categories) and because its membership does not overlap the memberships of the others.Categories B, C, and D are fuzzier (many members far from the schemas) and all three pairs overlap.
Now consider three producers whose feature values place them close to the producers assigned one of the labels. As drawn, producers $x_1$ and $x_2$ stand at the same distance from the schema for A. But $x_1$ lies further from the memberships of the other categories. Thus, according to the theory, $x_1$ has higher typicality in A than does $x_2$ and therefore has a higher hazard of being labeled as an A. This means that the expected waiting time to labeled membership is lower for $x_1$. In typical cases, $x_1$ will be labeled an A before $x_2$. If this occurs, then the membership of A drifts to the left, away from the memberships of the other categories.
Next compare the situations for $x_3$ and $x_4$. As the illustrative figure is drawn, $x_3$ lies outside the membership of each category. In this respect it is similar to $x_1$ and $x_2$, and nearly the same reasoning applies. This producer ($x_3$) is slightly further away from the schema for D than is $x_4$; and $x_3$ is less proximate to the memberships of all categories than is $x_4$. So, there is a high likelihood that $x_3$ will gain the label D. If this happens, it will cause the membership of D to move to the right, away from the other categories, similar to the situation with category A. But $x_4$ is much closer to the schemas for B and C (and A for that matter) than is $x_3$. It has a high hazard of becoming labeled as a member of B. If such an event does occur, this creates a membership overlap where one did not exist (as the figure is drawn). Obviously this new overlap increases the leniency of B and D. The situation is more complicated than in the analysis of A, and the net result depends on the flow of the stochastic events. But there is no clear pattern as was the case for the constraining category.
We estimated growth models (at the category level) for leniency, specifications in which leniency in a year depends linearly on leniency in the previous year, the fuzzy density of the category, its tenure, and yearly fixed effects.With either category fixed effects or category random effects, we find a positive and significant effect of leniency on the growth in leniency.In other words, the higher is a category's leniency, the higher is its growth rate in leniency.
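One way to write the specification described above is shown below. This is a sketch under assumptions, not the estimated equation: the symbols are introduced here for illustration, and the timing of the density and tenure terms is not stated in the text.

```latex
L_{c,t} = \beta_0 + \beta_1 L_{c,t-1} + \beta_2 D_{c,t} + \beta_3 T_{c,t} + \tau_t + u_c + \varepsilon_{c,t}
```

Here $L_{c,t}$ is the leniency of category $c$ in year $t$, $D_{c,t}$ its fuzzy density, $T_{c,t}$ its tenure, $\tau_t$ a year fixed effect, and $u_c$ a category fixed or random effect. The reported finding corresponds to a positive and significant estimate of the effect of lagged leniency.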
These dynamics likely feed back to the schemas.For instance, as a constraining category drifts away from the lenient ones, what is typical of this category is also shifting away.If, as we expect, audience members try to capture typicality in forming schemas, then the schema will also drift away from the others.This makes the initially constraining category even more distinctive.
If a lenient label remains lenient or becomes more so, then the broad (or broadening) overlap lowers the chance that the audience reaches a consensus about the association of a schema with the label.Put differently, increased leniency diminishes agreement about the meanings.Pockets might develop where audiences have experience with organizations that have one type of overlap with a lenient label, and they might develop a schema with respect to that subset of activities.This likely conflicts with the schema developed by another audience exposed to other organizations that are members of the lenient label but have a different set of overlaps.This might lead to a lack of consensus about the label's schema for relevant audiences.Hence lenient labels might evolve to have ambiguous and uncertain meanings.At best their schemas are defined at a very general level.
This dynamic is reflected in the trajectory of a number of prominent lenient categories in our analysis.For instance, this is the process James Kobielus refers to in his "What's not BI?" blog, when he states that "almost every data management technology has been swept into BI's gravitational orbit at one time or another by somebody somewhere" (Kobielus 2010).Gartner comes to a similar conclusion in their 2006 magic quadrant report on <business intelligence>, noting that <BI> is increasing in scope to encompass more types of users, accessing more information sources and a wider array of applications.<BI> becomes even more lenient over time.In Gartner's (2006) report, they define <business intelligence> as "the mission of BI . . . is the access to and analysis of quantitative information sources to deliver insight that empowers decision makers" (Schlegel et al. 2006).Their 2008 report provides an even less constraining definition that does not specify that the platform should connect the user with disparate information sources: "BI platforms enable users to build applications that help organizations learn and understand their business" (Richardson et al. 2008).
A similar trend is apparent in the evolution of <enterprise resource planning (ERP)> and <customer relationship management (CRM)>. Initially lenient categories, both have increased in leniency over time to the point that they no longer even pose the constraint that category members produce software. Gartner's (2010) report on <ERP> describes the market category as having evolved from a technology product to a strategy:
In the original definition, ERP systems' functionality normally covers the following areas: finance and accounting . . . purchasing, human resource management, sales or customer order management, and operations management. However, Gartner now defines ERP in a broader sense as "a technology strategy in which operational business transactions are linked to financial transactions, specifically general ledger transaction." (Hestermann et al. 2010)
The editors at CRM magazine similarly describe <CRM> as having evolved from a type of software to a business philosophy:
CRM, or Customer Relationship Management, is a company-wide business strategy designed to reduce costs and increase profitability by solidifying customer loyalty. . . Once thought of as a type of software, CRM has evolved into a customer-centric philosophy that must permeate an entire organization. (CRM Magazine 21 February 2010)
These examples reflect the dynamic we described: atypical members are more likely to be assigned membership in a lenient label, leading to increased leniency and a broadening of the label's social meaning.

Conclusion
In this article we argue that social categorization is governed by an ecological dynamic in two planes: feature space and label space. When an actor changes its feature values or drops or adds a label, these changes not only affect the focal actor but also contribute to changing the meanings of social categories. This, in turn, can lead to cascades of changes in classification, with entities shifting positions in both spaces in response to movements of others. Such uncoordinated movements might seem to lead to chaos. On the contrary, we suggest that this dynamic can perpetuate a link between labels and sets of feature values, if actors pay attention to the positions of others and use labels accordingly. At the same time, it is possible for categories to become lenient in label space, with high degrees of overlap and vague boundaries. This creates an ecology within label space, with proximities to other labels disrupting a clear relationship between sets of feature values and perceived typicality in the category. The result is that leniency weakens the relationship between feature-based similarity and categorical assignments.
To investigate these ideas, we built a formal model relating positions in the two spaces.Our theory implies that an object's proximity to labeled clusters in feature space influences the adding or dropping of label affiliations in label space.Furthermore, it predicts that for lenient labels, the link between feature values and labels weakens.The importance of expressing these ideas in formal language comes through in our analysis.It turns out to be tricky to formally derive these conclusions, which suggests that they might not be as straightforward as would appear if expressed in natural language.Furthermore, our analysis highlights the assumptions necessary to arrive at our conclusions.
Our empirical analysis tests the theory's implications in the software industry.We use data on organizations' patents to represent positions in feature space and affiliations with market labels from press releases to represent positions in label space.The results support the new theoretical propositions.We find evidence that when producers are proximate in knowledge space to a label they affiliate with, they are less likely to drop it.If they lie close in knowledge space to labels they do not affiliate with, they are more likely to drop any label and add the more proximate one.We also find that feature-space proximity to lenient labels has a weaker effect on adding a label, as compared to proximity to constraining labels.
In deciding whether to affiliate with a label, agents compare their own feature values with those that already claim the label.This means that the specific features associated with labels can (and do) change over time.Even so, a link between features and labels can persist.The relationship between features and labels is reinforced simply when agents observe how others are classified and respond in kind.It is informative to note what we do not assume: it is not necessary to have strong mechanisms of boundary enforcement or third parties such as critics and intermediaries to retain a link between features and labels.It is enough for agents to infer typicality from featurebased similarities to a labeled cluster and to label based on typicality.
At the same time, our study shows that increasing leniency weakens the relationship between features and label affiliations, resulting in a partial decoupling between the two spaces for lenient labels.With informal classification, both constraint and leniency get reinforced as actors independently navigate feature space and label space.The result is that classification evolves to contain a mix of categories: some constraining, some lenient.
How labels become lenient is a different matter. Leniency likely emerges when atypical entities are frequently assigned membership in a label and do not face censure. This suggests that a lack of boundary enforcement can foster the emergence of lenient labels in a classification. Importantly, however, these conditions are not necessary to propagate lenient labels once they exist. This means that once a label becomes lenient, it is likely to remain that way, even with an audience that attends to the links between labels and feature values.
This study also indicates that actors compare positions in feature space and label space relative to the positions of other actors in the domain. It is proximity to objects that are affiliated with a label, not necessarily fixed positions in feature space, that drives labeling. In this way, a producer can become more or less proximate to a labeled cluster if the cluster moves in feature space, even if the producer does not move. We believe this captures an interesting facet of social classification: that feature values associated with existing categories can (and do) change, and that changes affect perceptions of the category.
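A small sketch makes the point about relative position concrete. It assumes, purely for illustration, that a labeled cluster can be summarized by its centroid and that proximity is measured with Euclidean distance; the article's own proximity measure is not reproduced here, and the function names and coordinates are hypothetical.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a set of feature positions."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def distance_to_cluster(position, members):
    """Euclidean distance from a position to the centroid of a labeled cluster."""
    return math.dist(position, centroid(members))

# The focal producer stays at the same point in feature space in both periods.
producer = (0.0, 0.0)

# The agents claiming the label shift their feature values between periods.
members_t1 = [(2.0, 0.0), (4.0, 0.0)]  # centroid at (3, 0)
members_t2 = [(5.0, 0.0), (7.0, 0.0)]  # centroid at (6, 0)

distance_before = distance_to_cluster(producer, members_t1)
distance_after = distance_to_cluster(producer, members_t2)
print(distance_before, distance_after)
```

The producer's coordinates are identical in both periods, yet its distance to the labeled cluster doubles because the cluster itself has moved.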
In summary, this article explores dynamics in market classification that arise when producers compare their feature positions to those of competitors and use these comparisons to inform market-label affiliation. Producers evaluate how similar their organizations are to others when deciding whether to adopt or drop a market label. This action does not go unnoticed by others who are similarly comparing feature positions and label affiliations. Thus emerges an ecology of social categories, where positions in feature space and label space change, but a link between them persists, resulting in a dynamic and meaningful system of classification.

Figure 1: Example of a feature space with three binary features.

Figure 2: Example of an isomorphic embedding of schemas (sets of feature values) and labels.

Figure 3: Two illustrative membership functions showing the effect of a difference in proximity on the relationship between distance from a category schema/prototype and the probability of labeling; see text.

Figure 4: Sample press release identity statements with category affiliations.

Figure 5: Effects of knowledge proximity on the hazard of adding a label, by leniency (from Table 3, model 3).

Figure 6: Effects of knowledge proximity on the hazard of dropping a label, by leniency (from Table 4, model 4).

Figure 7: Illustration of the dynamic implications of the theory.

Table 1: Descriptive Statistics for Tests Involving the Hazard of Adding a Label (N = 1,434,096).

Table 2: Descriptive Statistics for Tests Involving the Hazard of Dropping a Label (N = 10,908).

Table 3: Effects of Proximity and Leniency on the Hazards of Adding a Label (ML Estimates of Piecewise-Continuous Hazard Models).

Table 4: Effects of Proximity and Leniency on the Hazards of Dropping a Label (ML Estimates of Piecewise-Continuous Hazard Models).