Preparing India Cluster Definitions

The first step is to create the format in which the data (employment, wages and GVA) is required, i.e. to prepare the India Cluster Definitions sheet based on the US Cluster Definitions. The US Cluster Definitions prepared by Delgado, Porter, & Stern (2014) are based on NAICS codes, so a mapping is required between the US NAICS codes and the Indian NIC codes. The Indian NIC codes were revised in 1998, 2004 and 2008, so three mapping sheets are prepared.

The following steps are required:

  • Prepare a mapping sheet between 5-digit NIC 2008 and NAICS codes.
  • Prepare India Cluster Definitions sheet based on the US Cluster Definitions by using the above mapping.
  • The study covers 1999-2014, so obtaining data for the years before 2009 requires mappings from NIC 2008 to NIC 2004 and NIC 1998 codes.
  • Create a mapping between NIC 2008 and NIC 2004 codes. Update the India Cluster Definitions using 2004 codes.
  • Create a mapping between NIC 2008 and NIC 1998 codes. Update the India Cluster Definitions using 1998 codes.
  • Although the definitions are based on industry patterns in the US, there are several reasons why they are useful in the Indian context as well. First, the US provides more granular data across all of its regions than is available for India, so applying the same methodology to Indian data would not achieve the level of precision that is possible in the US. For instance, data for the education and knowledge creation industries is missing in India, which makes it impossible to define that cluster[1]. Also, some clusters that are local by nature have data available for only some of their industries, so the resulting local cluster definitions would suffer from this lower quality of data. Second, patterns of economic geography are more visible in the US than in India. The US definitions therefore provide a view of what cluster categories should look like.
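The mapping steps above amount to chaining lookup tables. The following is a minimal sketch in Python, assuming each mapping sheet has been loaded as a dictionary and that every code maps to exactly one counterpart (real concordances may be one-to-many). All codes, cluster names and function names below are made-up placeholders, not entries from the actual sheets.

```python
# Hypothetical placeholder codes; the real sheets hold the actual NIC/NAICS pairs.
NIC2008_TO_NAICS = {"11111": "900001", "22222": "900002"}
NAICS_TO_CLUSTER = {"900001": "Cluster A", "900002": "Cluster B"}
# Older classifications are mapped onto NIC 2008 so that earlier years can be used.
NIC2004_TO_NIC2008 = {"33333": "11111"}
NIC1998_TO_NIC2008 = {"44444": "22222"}


def cluster_for(nic_code, nic_version):
    """Return the cluster for an NIC code of the given version, or None if unmapped."""
    if nic_version == 2008:
        nic2008 = nic_code
    elif nic_version == 2004:
        nic2008 = NIC2004_TO_NIC2008.get(nic_code)
    elif nic_version == 1998:
        nic2008 = NIC1998_TO_NIC2008.get(nic_code)
    else:
        raise ValueError(f"unsupported NIC version: {nic_version}")
    # Chain NIC 2008 -> NAICS -> cluster; a missing link yields None.
    naics = NIC2008_TO_NAICS.get(nic2008)
    return NAICS_TO_CLUSTER.get(naics)
```

Codes that return None are the ones that still need an entry in one of the mapping sheets.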

Extracting data from ASI

The raw ASI data is organised into blocks. Each block provides the following information:

Block  Information
A  Identification Particulars: Year, Block, State Code, District Code, Sector, Industry Code, Number of Working Days, Cost of Production, Multiplier etc.
B  Owners Details: Type of organisation, Type of ownership, Total number of units, Original value of Investment in P & M (codes), ISO Certification, Year of initial production, Accounting year, Months of operation, Computerised A/C system and availability of data in Computer.
C  Fixed Assets Details: Gross Value, Net Value, Depreciation
D  Working Capital: Working capital, opening and closing
E  Employment & Labour Cost: Manufacturing man-days worked, non-manufacturing man-days worked, average number of persons worked, number of man-days paid for, wages & salaries
F  Other Expenses: Purchase value of goods, interest paid, rent paid, total expenses, operating and non-operating expenses, insurance, repair and maintenance
G  Other Incomes: Income from services, total receipts, rent received, interest received, subsidies, balance of goods, value of own construction
H  Input Items: Quantity consumed, purchase value, rate per unit, item and unit code
I  Input Items Imported: Quantity consumed, purchase value, rate per unit, item and unit code
J  Products: Quantity manufactured, quantity sold, gross and net sale value, excise, sales tax, ex-factory value of quantity manufactured

Each block contains the “Identifier” named DSL. This will be used to merge the data across blocks.
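Merging on DSL can be sketched as a left join onto Block A. This is an illustration only: it assumes each block has been read into a list of dictionaries, that DSL is unique within the files being merged, and that the other block holds at most one row per DSL (a block with several rows per factory would need to be summed first). The field names other than DSL are hypothetical.

```python
def merge_blocks(block_a, other_block, key="DSL"):
    """Left-join other_block onto block_a using the shared identifier."""
    lookup = {row[key]: row for row in other_block}
    merged = []
    for row in block_a:
        combined = dict(row)
        # Add the other block's fields; factories missing from it keep only Block A fields.
        combined.update(lookup.get(row[key], {}))
        merged.append(combined)
    return merged


# Hypothetical rows standing in for Block A and Block E.
block_a = [{"DSL": 1, "state_code": "07"}, {"DSL": 2, "state_code": "27"}]
block_e = [{"DSL": 1, "wages": 1000.0}]
merged = merge_blocks(block_a, block_e)
```

A left join keeps every factory listed in Block A, so gaps in the other blocks show up as missing fields rather than dropped rows.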


Aggregating the ASI data

The second step is to aggregate the required data into the cluster format created in the step above. Data is required for the following variables:

Variable: Definition (ASI)
Number of Units: The number of units that are working, i.e. units whose status is 1.
Production Workers: Male and female workers employed directly, and workers employed through contractors
Skilled Workers: Supervisory & managerial staff and other employees
Total Employees: The sum of production workers, skilled workers and unpaid family members.
Skilled Worker Wages: Wages of supervisory & managerial staff and other employees
Production Worker Wages: Wages of male and female workers employed directly, and workers employed through contractors
Total Employee Wages: Wages of production workers, skilled workers and unpaid family members.
Gross Value Added: Total Output - Total Input


Total Output = Ex-factory value of quantity manufactured + Total receipts

Total Input = Materials consumed + Fuels consumed + Other input values
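As a worked illustration of the three formulas above, the sketch below computes Gross Value Added for a single factory. The function name, parameter names and figures are invented for the example and are not ASI field names or ASI data.

```python
def gross_value_added(ex_factory_value, total_receipts, materials, fuels, other_inputs):
    """GVA = Total Output - Total Input, following the definitions above."""
    total_output = ex_factory_value + total_receipts
    total_input = materials + fuels + other_inputs
    return total_output - total_input


# Invented figures: output = 500 + 100 = 600, input = 250 + 50 + 40 = 340.
gva = gross_value_added(500.0, 100.0, 250.0, 50.0, 40.0)
```

With these figures, Gross Value Added is 600 - 340 = 260.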


  • Collect the number of units, production workers, skilled workers, total employees, production worker wages, skilled worker wages and total wages at the factory level for each state from different blocks after merging them with Block A. They represent the actual values.
  • Use the multiplier value (already provided by ASI) to get the estimated figures for each of the variables.
  • Aggregate these factory-level estimates into the cluster format (based on 5-digit NIC industry codes) using the following formula:

$$\text{Number of Units}_{cr} = \sum_{i \in c} \text{Number of Units}_{ir}$$

where c = cluster, r = region, and i = the 5-digit industries in cluster c.

  • Use the same formula for aggregating other variables.
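The multiplier and aggregation steps above can be sketched together as follows. This assumes the merged factory-level rows are held as dictionaries, that the estimate is the actual value times the ASI multiplier, and that each working factory carries a unit count of 1. The field names ("nic5", "region", "multiplier", "units", "total_employees") and all values are hypothetical.

```python
from collections import defaultdict


def aggregate(factories, industry_to_cluster, variable):
    """Sum multiplier-weighted values of `variable` by (cluster, region)."""
    totals = defaultdict(float)
    for factory in factories:
        cluster = industry_to_cluster.get(factory["nic5"])
        if cluster is None:
            continue  # industry is not assigned to any cluster
        # Estimated figure = actual value x multiplier.
        estimate = factory[variable] * factory["multiplier"]
        totals[(cluster, factory["region"])] += estimate
    return dict(totals)


# Invented factory rows; "units" is 1 for each working factory.
factories = [
    {"nic5": "11111", "region": "R1", "multiplier": 2.0, "units": 1, "total_employees": 10},
    {"nic5": "22222", "region": "R1", "multiplier": 1.5, "units": 1, "total_employees": 20},
    {"nic5": "11111", "region": "R2", "multiplier": 1.0, "units": 1, "total_employees": 5},
]
industry_to_cluster = {"11111": "Cluster A", "22222": "Cluster A"}

units = aggregate(factories, industry_to_cluster, "units")
employees = aggregate(factories, industry_to_cluster, "total_employees")
```

The same function serves every variable, which mirrors the instruction to reuse the one formula for employment and wages.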

[1] The dataset used is ASI.


©2018 Institute for Competitiveness, India
