What Can Explain the Chinese Patent Explosion?

We analyse the ‘explosion’ of patent filings by Chinese residents both domestically and in the United States during the early 2000s, employing a unique dataset that matches patent applications to manufacturing census data for 374,000 firms. Our analysis reveals that patenting is highly concentrated among a small number of firms operating in the information and communication technology sector. Although increases in patent filings by these companies are partly driven by increased R&D intensity, our analysis suggests that the explosion of patent filings at the Chinese patent office is driven by factors other than underlying innovative behavior, including government subsidies that encourage patent filings directly.
JEL classification: L25, O12


Introduction
China's economic success over the past decades has been widely regarded as the result of its ability to produce manufactured goods at low cost, building on the availability of cheap labour and scale economies, while relying on existing technologies of production. China's ability to upgrade its technology base and move up the value-chain is frequently argued to be hampered by weak (intellectual) property rights enforcement (Zhao, 2006). More recently, however, the notion that China is catching up fast in terms of scientific and technological innovation has gained considerable ground. The number of domestic invention patent filings with the Chinese patent office (SIPO) has increased at an average rate of 32% per annum from around 15,600 to over 700,000 during the period 1999-2013. 1 Utility patent filings by Chinese residents 2 with the U.S. patent office (USPTO) grew at an annual rate of 35% to nearly 15,500 over the same period, albeit from a low base of 271 in 1999. 3 This patent 'explosion' at home and abroad is paired with strengthened statutory intellectual property (IP) rights protection (Park, 2008). At the same time, there is some evidence to suggest that most of the innovation in China is of a merely incremental nature and hence the corresponding patents protect 'small inventive steps' (Puga and Trefler, 2010). While such incremental innovation may still be valuable and in fact account in large part for China's success (Breznitz and Murphree, 2011), the concern is that the recent increase in patent applications is produced overwhelmingly by inventions embodying little technological progress.
Recent empirical evidence suggests that patent subsidies, introduced by local governments in virtually all Chinese provinces from 1999 onwards, have also played an important role in explaining the 'explosive' growth of Chinese patenting (Li, 2012; Dang and Motohashi, 2015). Boeing and Mueller (2015) suggest that patent quality of PCT filings 4 by Chinese applicants is low by international comparison and that quality has been decreasing over time as the number of filings has increased. They also find some evidence for a negative correlation between patent quality and filing subsidies.
1 Data from the World Intellectual Property Organization (WIPO).
2 We use 'Chinese firms' and 'Chinese residents' interchangeably. Our firm-level data covers indigenous firms and subsidiaries of foreign multinationals. U.S. utility patents correspond to invention patents in China.
3 Data taken from various USPTO Performance and Accountability Reports.
The view that China's patent explosion over the past two decades was driven largely by an increase in the patenting of low quality inventions, fueled by public incentive schemes, stands in stark contrast to earlier findings in the literature, which explained the recent increase in Chinese firms' patenting activity by an influx of FDI, the opening of the economy in particular through China's WTO accession, and a major overhaul of the legal framework in the form of amendments of the patent law (Hu and Jefferson, 2009). Despite widespread doubts about the link between innovative prowess and the Chinese patent explosion in the media and in policy circles, 5 there is no quantitative analysis based on representative firm-level data that investigates the determinants of the Chinese patent explosion during its critical years in the early 2000s.
We analyze the recent 'explosion' in the number of patent applications by manufacturing firms registered in China with SIPO as well as the USPTO, which is by far the most important destination for Chinese patent filings abroad (Wunsch-Vincent et al., 2015). In contrast to the study by Hu and Jefferson (2009) our analysis is focused on 'invention' patents which are subject to substantive examination for novelty and inventiveness in both jurisdictions; this prevents our analysis from being distorted by the vast number of utility models and design patents with low innovative content that do not require substantive examination by the Chinese or U.S. patent offices. Apart from separately analysing the determinants of patenting with SIPO and the USPTO, we infer information on underlying inventions by assessing where companies seek patent protection: only domestically with SIPO or (also) with the USPTO. Not only are the associated direct and indirect costs higher in the U.S., but inventions are also required to overcome a higher novelty hurdle in patent examination during our sample period. These differences suggest that a comparison of patents filed with the USPTO and SIPO reveals additional information on the underlying invention and the corresponding patentees.
We construct a representative firm-level dataset that combines invention patent data and company financials. We match SIPO and USPTO patents filed between 1985 and 2006 to around 316,000 manufacturing firms contained in China's Annual Survey of Industrial Enterprises (ASIE) compiled by the National Bureau of Statistics of China (NBS) for the period 1999-2006. 6 The period covered represents perhaps the most interesting period in state innovation and IP policy as well as firm innovation activity in China: it encompasses aggressive opening up to FDI, policy commitments related to WTO [...] Mathews, 2008; Hu, 2010) or self-reported patenting without distinction between low-sophistication design or utility and more substantive invention patents (Hu and Jefferson, 2009). Comparing the descriptive statistics for patenting with non-patenting firms, and for those firms patenting in the U.S. with those exclusively patenting in China, reveals a large number of significant differences to motivate our empirical analysis.
4 Filings under the 'Patent Cooperation Treaty' allow an inventor to simultaneously seek protection in a large number of countries using a single application.
5 In particular The Economist magazine has voiced repeated concerns that 'merely churning out patents does little to advance innovation' (Dec 13th 2014; see also Oct 14th 2010).
We rely on the patent production function approach (Pakes and Griliches, 1980; Hall and Ziedonis, 2001) to explain the patenting decision and number of patent filings by Chinese companies with SIPO and the USPTO, respectively. Apart from the standard predictors of patenting, such as R&D expenditure, firm size, and age, we are particularly interested in the importance of a firm's exporting behavior, financial constraints, as well as province-level patent subsidies in predicting patenting behaviour. There [...] Melitz and Trefler, 2012), which suggests that exporting in turn should predict patenting provided the patents reflect underlying innovations. Similarly, financial variables are key determinants of corporate innovation activities (Brown et al., 2009, 2012; Guariglia and Liu, 2014) and may help identify structural differences between types of firms based on where they choose to safeguard their IP rights. Finally, with specific reference to China there is recent evidence which suggests that state subsidies are an important element in explaining patent filings of Chinese firms (Li, 2012; Dang and Motohashi, 2015; Lei et al., 2015) and we add information on provincial patent filing subsidy schemes to our patent production functions.
6 Our regressions also include firms which are not part of our Qin/Oriana bridge dataset (see Section 2): we empirically account for selection from the larger ASIE (374,000 firms) into the integrated ASIE-Qin/Oriana (316,000 firms) dataset.
Our findings confirm that patent filings with SIPO are in part driven by state incentive schemes, and we further document a negative correlation between export intensity and domestic patenting. In contrast, for USPTO patentees resident in China the incentive variable is insignificant and export intensity is positively correlated with foreign patenting. Those companies in China filing with the USPTO are substantially larger in terms of number of workers than those only filing domestically.
Financial constraints play an important role in innovation behaviour but do not appear to be a source of differential firm behaviour eliciting qualitative differences. Our findings thus suggest that domestic patenting in China, on average, is driven by state incentives and distinct 'types' of firms (in terms of size and export intensity) compared with those firms patenting overseas with the USPTO.
Our analysis contributes to the literature on innovation and economic development (Nordhaus, 1969; Penrose, 1973) by exploring the drivers behind a dramatic shift in the number of patent filings in China. Our results illustrate that large increases in domestic patenting activity per se cannot be seen as indicative of associated changes in innovative behavior in a developing country context. The strong concentration of patenting in ICT that we find in China on the one hand, and the impact of public incentive programs as well as the inverse export-patenting relationship on the other, further caution that a broader technological take-off is not (yet) occurring. That said, other successful Asian economies have seen similar concentrations in patenting activity, in particular during the early take-off phases. 7
The remainder of this paper is organized as follows. Section 2 discusses the construction of our dataset. Section 3 explains our empirical strategy. Sections 4 and 5 discuss some descriptive evidence and our analytical results. Section 6 offers some brief concluding thoughts.
7 Mahmood and Singh (2003) point to a strong concentration of USPTO patents among assignees in South Korea and Singapore, as the top 50 assignees hold 85% and 70% of each country's USPTO patents, respectively.

Firm-level Data
Our firm-level data come from China's Annual Survey of Industrial Enterprises (ASIE) compiled by the National Bureau of Statistics of China (NBS). ASIE includes the whole population of state-owned firms as well as all non-state-owned companies with annual sales above CNY5 million (around US$600,000).

Patent Data
The patent data come from the European Patent Office's PATSTAT database (version 10/2010). We extract patents filed by Chinese residents (this includes indigenous and foreign(-invested) firms). Our analysis focuses on the application date of a patent. We obtain information on the grant status of patent filings from a 2014 version of PATSTAT to account for a grant lag of several years.

Matching/Bridge
Due to the absence of a unique identifier shared by the firm-level and patent data, the main data problem consists in matching patents to firms. This is generally challenging for a number of reasons (Helmers et al., 2011); in the case of China, matching is even more difficult due to the different ways in which firm names can be recorded: using (a) Chinese characters, (b) pinyin transcription, (c) a translation of the Chinese names into English, and (d) any mix of (a)-(c).
The Chinese census data contain only firm names using Chinese characters (a), whereas PATSTAT contains (b), (c) and (d). In principle, to match patents to firms we would have to transcribe either the firm names contained in ASIE or the assignee names in PATSTAT. Instead we identified an alternative solution: the Qin and Oriana databases provided by Bureau Van Dijk offer firm-level balance sheet data for individual firms in the Asia-Pacific region. Together, Qin and Oriana contain data for about 451,000 Chinese firms for 2001-2009. The advantage of using Qin/Oriana is that these report firm names using the Latin alphabet as well as the ASIE unique firm identifier. This allows us to link Qin/Oriana to ASIE through the unique identifier and to use Qin/Oriana firm names to match with assignee names contained in PATSTAT. While this approach allows us to match patent data to Chinese firms, it also has some limitations, which together with suggested remedies are discussed in an online appendix.
Our integrated dataset matching ASIE to Qin/Oriana covers 316,000 firms, while the full ASIE sample for 2001-2006 contains 374,000 firms (average T_i = 2.3). Tables A-1 and A-2 in the online appendix contain information and descriptive statistics on the sample of firms used in our regression analysis.
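The bridge logic described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the matching procedure actually used: the name normalisation rules, the list of legal-form tokens, and the firm identifiers in the usage note are all hypothetical, and only exact matches on normalised names are considered.

```python
import re
import unicodedata

# Hypothetical legal-form tokens dropped before names are compared (an assumption).
LEGAL_TOKENS = {"co", "ltd", "limited", "inc", "corp", "corporation", "company"}


def normalise_name(name: str) -> str:
    """Lower-case, strip accents and punctuation, and drop legal-form tokens."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    tokens = re.findall(r"[a-z0-9]+", ascii_name.lower())
    return " ".join(t for t in tokens if t not in LEGAL_TOKENS)


def match_patents_to_asie(bridge_rows, patent_rows):
    """Attach an ASIE firm identifier to each patent via Latin-alphabet names.

    bridge_rows: iterable of (asie_id, latin_name) pairs, as a Qin/Oriana-style
                 bridge would provide them.
    patent_rows: iterable of (patent_id, assignee_name) pairs, PATSTAT-style.
    Returns a dict patent_id -> asie_id for unambiguous exact matches.
    """
    name_to_id = {}
    ambiguous = set()
    for asie_id, latin_name in bridge_rows:
        key = normalise_name(latin_name)
        if key in name_to_id and name_to_id[key] != asie_id:
            ambiguous.add(key)  # same normalised name, different firms: skip
        name_to_id[key] = asie_id

    matches = {}
    for patent_id, assignee in patent_rows:
        key = normalise_name(assignee)
        if key in name_to_id and key not in ambiguous:
            matches[patent_id] = name_to_id[key]
    return matches
```

For example, with a hypothetical bridge row `("A001", "Huawei Technologies Co., Ltd.")`, a patent whose assignee is recorded as `"HUAWEI TECHNOLOGIES CO LTD"` would be assigned to firm `A001`, while names that normalise to the same string for two different firms are left unmatched.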

Empirical Strategy
Our objective is to analyze the drivers behind the explosion in patent filings in China. The existing evidence is ambivalent about the factors that have contributed to the rapid rise in patent filings. On the one hand, Hu and Jefferson (2009) suggest that patenting in China is explained by increases in FDI, China's WTO accession, and improvements in the legal framework and enforceability of IP, with the latter two empirically captured by time dummies. On the other, there is a widely-held view that SIPO rubber-stamps patent filings which protect at best low-value, incremental inventions (Puga and Trefler, 2010), and that filings are largely driven by government incentives which target patenting directly (Li, [...]
To explore the determinants of patenting in China we use the patent production function approach (Pakes and Griliches, 1980; Hall and Ziedonis, 2001) that relates a firm's patent filings to a standard set of variables, such as R&D expenditure, firm size and age. In light of the export-innovation literature, [...]
Our main interest is in our ability to predict patent filings with SIPO by companies resident in China using the patent production function approach, which allows us to analyze the determinants of the Chinese patent explosion. To provide a benchmark against which to compare our results on the predictors of patent filings with SIPO, we use the same production function to predict patent filings by Chinese residents with the USPTO: since patent filings in the U.S. are subject to a different standard than filings with SIPO (for a detailed discussion see online appendix D), comparing the determinants of USPTO and SIPO patent filings by the same set of companies in China offers additional insights on the determinants of the patent explosion in China.
More specifically, controlling for all standard determinants of firm-level innovation and patenting including a set of variables capturing financial constraints, if patenting with SIPO is driven by factors other than innovation, we expect in particular export intensity to predict filings only at the USPTO but not SIPO. In contrast, due to the policy drive to promote domestic patenting directly, we expect patent subsidies to predict filings only with SIPO but not the USPTO.
We test these hypotheses through a number of alternative empirical models which are all variations of the Pakes and Griliches (1980) patent production approach. We begin with the patenting decision, where we disregard the patent count and focus merely on the prevalence of patenting. We employ binary choice models to analyse two dichotomous outcomes, namely patenting with SIPO and patenting with the USPTO, in a standard random utility formulation (Greene and Hensher, 2010).
We address selection into our integrated dataset, a subsample of ASIE, as part of our analysis of the patenting decision by modelling selection and patenting jointly: in bivariate probit models for USPTO and SIPO patenting, respectively (results available on request), and then in trivariate probit models jointly estimating selection, patenting with the USPTO, and patenting with SIPO. 9 The formal representation of the trivariate probit model is [...] where Φ(·, Σ) is a multivariate normal distribution, 1{·} represents binary variables ('sipo' and 'uspto' for at least one patent application with SIPO and USPTO, respectively; 'ss' is the sample selection equation), and d^j_p and d^j_t are province (see below) and time fixed effects. We enter five groups of covariates to analyse the association of patenting with firm-level innovation effort (INNOV), export behaviour (EX) and financial constraints (FIN), as well as government patenting incentives (INCENT), on top of additional control variables (X) related to firm size, age, and ownership type. 10 In an additional specification we account for unobserved heterogeneity potentially distorting our results by including provincial dummies in the trivariate probit models. The results from this exercise (available on request) are qualitatively in line with those presented here. 11
In order to gauge the reliability of our results in the face of potential endogeneity of our regressors, we estimate instrumental variable (IV) probit models adopting first or first and second lags of all variables (except firm age, ownership, and time dummies) as instruments. 12
A second set of regressions then analyses the number of patent applications and grants by estimating nonlinear functions which relate the patent count to firm characteristics, using the same sets of covariates as above. We treat our panel as repeated cross-sections (see Bound et al., 1984), in the spirit of [...] Hilbe, 2011). The Negative Binomial estimator enables us to introduce a separate dispersion parameter κ to overcome this issue: 13 the formal model representation of these estimators is [...] with y_it the patent count and λ_it = exp(Z_it ϕ), where for convenience of notation we have expressed the five sets of covariates and dummies detailed above with matrix Z and their respective coefficient vectors with ϕ.
9 Addressing selection in these nonlinear models does not require an exclusion restriction from the selection equation: identification is in principle given through functional form (Greene and Hensher, 2010).
10 Full details of all variables and controls included in the models are contained in the online appendix.
11 It is well-known that the inclusion of a large number of fixed effects in nonlinear models creates serious bias due to the incidental parameter problem. This problem should not create any difficulties for a mere 30 province dummies; however, China's vast economic heterogeneity creates a separate problem here in that nine (two) provinces have no firms with any patent applications with USPTO (SIPO) over the 4-year sample period, which means that firms from these provinces are dropped.
12 Additional analysis (results available on request) replaces the dependent variable of at least one patent application with that of at least one granted patent, which can act as a basic proxy for the quality of innovations; results are qualitatively identical.
13 Tests for the statistical significance of κ reject the Poisson estimator in favour of the NegBin alternative in all cases.
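One way to write the two models described above, consistent with the surrounding definitions, is sketched below. The coefficient symbols (β, γ, δ, θ, π), the latent-variable notation, the indexing of INCENT by province and year, and the NB2 parametrisation of the Negative Binomial are assumptions introduced here for illustration; they are not taken from the source and may differ from the authors' own display.

```latex
% Hedged sketch: coefficient symbols and the NB2 parametrisation are assumptions.
\begin{align}
y^{j*}_{it} &= \mathrm{INNOV}_{it}\,\beta^{j} + \mathrm{EX}_{it}\,\gamma^{j}
             + \mathrm{FIN}_{it}\,\delta^{j} + \mathrm{INCENT}_{pt}\,\theta^{j}
             + X_{it}\,\pi^{j} + d^{j}_{p} + d^{j}_{t} + \varepsilon^{j}_{it}
             \;\equiv\; \mu^{j}_{it} + \varepsilon^{j}_{it}, \\
\mathbf{1}\{j\}_{it} &= \mathbf{1}\{\, y^{j*}_{it} > 0 \,\},
             \qquad j \in \{\mathrm{sipo},\, \mathrm{uspto},\, \mathrm{ss}\}, \\
\bigl(\varepsilon^{\mathrm{sipo}}_{it},\, \varepsilon^{\mathrm{uspto}}_{it},\,
      \varepsilon^{\mathrm{ss}}_{it}\bigr)' &\sim N(0, \Sigma),
             \qquad \Sigma \text{ with unit diagonal}, \\
\Pr\bigl(\mathbf{1}\{\mathrm{sipo}\}_{it} = 1,\ \mathbf{1}\{\mathrm{uspto}\}_{it} = 1,\
         \mathbf{1}\{\mathrm{ss}\}_{it} = 1\bigr)
            &= \Phi\bigl(\mu^{\mathrm{sipo}}_{it},\, \mu^{\mathrm{uspto}}_{it},\,
                         \mu^{\mathrm{ss}}_{it},\, \Sigma\bigr), \\[6pt]
\Pr(Y_{it} = y_{it} \mid Z_{it}) &=
   \frac{\Gamma(y_{it} + \kappa^{-1})}{\Gamma(y_{it} + 1)\,\Gamma(\kappa^{-1})}
   \left(\frac{\kappa^{-1}}{\kappa^{-1} + \lambda_{it}}\right)^{\kappa^{-1}}
   \left(\frac{\lambda_{it}}{\kappa^{-1} + \lambda_{it}}\right)^{y_{it}},
   \qquad \lambda_{it} = \exp(Z_{it}\phi).
\end{align}
```

Under this parametrisation the conditional mean is λ_it and the conditional variance is λ_it(1 + κλ_it), so κ → 0 recovers the Poisson model with Pr(Y_it = y_it) = exp(-λ_it) λ_it^{y_it} / y_it!.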
We also present results from a fixed effects ( [...]
There are substantial differences in subsidy programmes across provinces (Li, 2012) and many cities also offer their own patent subsidies (Lei et al., 2015); some programmes offer filing or examination subsidies, others pay out a cash reward only after successful grant. Some provincial and city governments fully reimburse filing and examination fees, others only reimburse a fraction of the fees. Others even determine subsidy amounts on a case-by-case basis. We use data collected by Dang and Motohashi (2015) on the presence and strength of provincial-level incentives targeting patenting directly, where our focus is on filing subsidies. This data substantially extends the information on subsidy schemes used in an earlier study by Li (2012) as it differentiates subsidy schemes between those that provide full or partial reimbursement of fees. It represents the most comprehensive available dataset on patent subsidies in China. Other studies on the effect of patent subsidies have used more limited data: Lei et al. (2015), for example, use data for six cities in Jiangsu Province, and Boeing and Mueller (2015) rely only on a year dummy variable to capture the introduction of a subsidy programme. Further details about the data used in our analysis and the evolution of patent subsidies across provinces over time are provided in the online appendix.
14 We add dummies for firms with zero R&D expenditure (87% of observations).
15 We define liquidity as the difference between a firm's current liquid assets and liabilities, normalised by total assets; and leverage as the ratio of total liabilities to total assets. Because R&D is treated as a current expense for accounting purposes we add R&D expenses to the standard measure of net cash flow (after-tax earnings plus depreciation) to obtain gross cash flow (see Himmelberg and Petersen, 1994); this cash flow variable is then normalised by total assets.
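The three financial-constraint measures defined in the footnote above can be computed as in the following minimal sketch. The function and argument names are hypothetical, and reading 'current liquid assets and liabilities' as current assets minus current liabilities is an assumption.

```python
def financial_ratios(current_assets, current_liabilities, total_liabilities,
                     total_assets, after_tax_earnings, depreciation, rd_expense):
    """Return liquidity, leverage and gross cash flow, each scaled by total assets.

    Assumption: liquidity uses current assets minus current liabilities.
    Gross cash flow adds R&D back to net cash flow (after-tax earnings plus
    depreciation), because R&D is expensed in the accounts.
    """
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    liquidity = (current_assets - current_liabilities) / total_assets
    leverage = total_liabilities / total_assets
    gross_cash_flow = (after_tax_earnings + depreciation + rd_expense) / total_assets
    return {
        "liquidity": liquidity,
        "leverage": leverage,
        "gross_cash_flow": gross_cash_flow,
    }


# Hypothetical balance-sheet figures (any common currency unit).
example = financial_ratios(
    current_assets=400, current_liabilities=250, total_liabilities=600,
    total_assets=1000, after_tax_earnings=80, depreciation=20, rd_expense=10,
)
```

With these made-up figures the firm has liquidity of 0.15, leverage of 0.60 and gross cash flow of 0.11, all expressed as shares of total assets.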
Our choice of additional firm-level controls is guided by standard suggestions in the literature, namely measures for size and age (both in logs), as well as characteristics with particular relevance for China, namely ownership type and province dummies (the latter as a robustness exercise, results available on request). Firm size is measured by employment and meant to capture possible economies of scale in patent production. In an OECD country context firm age is intended to capture the experience of older firms in the management of the patent application process (Hall and Ziedonis, 2001); in a China emerging from a planned economy, however, it is additionally an indicator of socialist-period legacy.
Ownership (our designation is based on paid-in capital share in excess of 50%, following Guariglia et al., 2011) includes two types of foreign-invested enterprises (FIEs) distinguishing those from Hong Kong, Macao and Taiwan (HMT) and elsewhere (other). We further distinguish Private, State-Owned (SOEs), Collective and Other Chinese firm types. We prefer to investigate the 'direct' effect of foreign direct investment (FDI) on patenting behaviour rather than relying on proxies suggested in the literature to capture 'knowledge spillovers' from FDI (Hu and Jefferson, 2009). We add year dummies to all models which allows us to chart the changes in patenting over time. All standard errors reported are clustered at the firm-level.
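The majority rule for ownership designation can be illustrated with a short sketch. The category labels and the fallback label for firms without a majority owner are hypothetical; how such firms are actually treated is not stated in the text above.

```python
def ownership_type(paid_in_capital, threshold=0.5):
    """Return the owner category holding more than `threshold` of paid-in capital.

    paid_in_capital: dict mapping a category label (hypothetical labels such as
    "state", "collective", "private", "fie_hmt", "fie_other") to the amount of
    paid-in capital contributed by that category.
    Returns "no_majority" (a hypothetical fallback) if no share exceeds the threshold.
    """
    total = sum(paid_in_capital.values())
    if total <= 0:
        raise ValueError("total paid-in capital must be positive")
    for category, amount in paid_in_capital.items():
        if amount / total > threshold:  # strictly in excess of 50%
            return category
    return "no_majority"
```

A firm with paid-in capital of 10 (state), 70 (private) and 20 (HMT investors) would be designated private, whereas an even 50/50 split yields no designation under a strict 'in excess of 50%' rule.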

Descriptive Evidence
Our integrated dataset enables us to produce a number of powerful insights into Chinese patenting through simple descriptive statistics. Tables 1 and 2 list the top-10 companies patenting with the USPTO and SIPO, respectively. These tables are constructed using the patent data for the entire time horizon 1985 to 2006 for the firms in our integrated dataset. Table 1 illustrates the concentration of USPTO patents among a small number of companies: the top-10 assignees account for slightly less than 75% of USPTO patents. Interestingly, three companies, Hongfujin (1), Fuzhun (3) and Futaihong (6), are subsidiaries of the Taiwanese-owned multinational Foxconn Technology Group, the world's largest contract manufacturer of 3C (Computer, Communication, Consumer electronics) products. These three subsidiaries account for 35% of total USPTO patent filings in our matched dataset; adding in communications giant Huawei brings the tally to over 50%.
As shown in the last column of Table 1, with the exception of Sinopec, Nuctech, and BYD, all top-10 USPTO patentees are in 3C industries. Table 2 shows SIPO patent filings, with the top-10 companies accounting for over half of all patents. Here the dominant player is Huawei, filing nearly a quarter of SIPO patents, whereas only one Foxconn subsidiary, Hongfujin, is among the top-10. Again, with the exception of Sinopec, BYD and Baoshan, all companies listed in Table 2 are in 3C industries. Note that there is a significant overlap of companies in Tables 1 and 2: six companies appear in both lists, with four of these in 3C industries.
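The concentration figures quoted above are top-k shares of total filings. A minimal sketch of that calculation follows; the firm labels and counts are invented for illustration and do not reproduce Tables 1 and 2.

```python
def top_k_share(patent_counts, k=10):
    """Share of all patents held by the k assignees with the most patents.

    patent_counts: dict mapping an assignee label to its number of patent filings.
    """
    total = sum(patent_counts.values())
    if total == 0:
        raise ValueError("no patents in the sample")
    largest = sorted(patent_counts.values(), reverse=True)[:k]
    return sum(largest) / total


# Hypothetical counts for five firms.
counts = {"firm_a": 50, "firm_b": 30, "firm_c": 10, "firm_d": 6, "firm_e": 4}
share_top2 = top_k_share(counts, k=2)
```

Here the two largest filers hold 80 of 100 patents, so `share_top2` is 0.8.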
Apart from asking who patents, the question of what is patented is equally important. We classify USPTO and SIPO patents according to the type of innovation they protect: product or process innovation or a combination of the two. There is a common perception in the literature that patents protecting product inventions reflect genuine innovations whereas process patents are of less innovative content as they only indicate new ways of producing some output by existing means. We read random subsamples of 1,900 USPTO and 980 SIPO patents. 16 Table A-3 in the online appendix shows a breakdown of patents filed by Chinese residents according to the innovation type they protect. For USPTO patents nearly half cover product innovations and only 20% process innovations. The pattern looks different in the case of SIPO patents: merely 30% protect product innovations and 37% process innovations.
This analysis suggests that inventions that are patented in China but not in the U.S. are more likely to protect process innovations. In contrast, results for USPTO patents indicate that the share of patents protecting product innovations is substantially higher.
Although there is clear evidence for substantial concentration of patenting among a small number of firms in either jurisdiction, we can also distinguish the observable characteristics between firms which (a) do and do not patent, and in turn between those which (b) patent with SIPO and the USPTO. Table 3 provides the respective unconditional mean comparison with associated one-sided t-tests. The columns on the left compare characteristics for patentees with non-patenting firms, highlighting the correlation between patenting and innovation effort (R&D expenditure). While export intensity is qualitatively similar, non-patenting firms have a higher propensity to be non-exporters or pure exporters; both of the latter findings ring true with reference to work on productivity and exporting (Melitz and Trefler, 2012; Defever and Riano, 2013). Patenting firms are larger, older and have higher liquidity than non-patenting firms, while state incentives to patent are higher in provinces where patenting firms are located. These results echo the findings of Guariglia and Liu (2014) who use new product sales as an indicator of innovation. Our simple analysis of means also finds significant differences between firms patenting (a) with the USPTO or (b) (only) with SIPO (in the columns on the right of the table): among the characteristics which distinguish USPTO patentees from those firms which patent only domestically, the higher R&D expenditure, export-to-sales ratio, and firm size are particularly noteworthy. A number of characteristics are also surprisingly unimportant in this comparison, notably the financial variables (except for cash flow) and the provincial-level subsidies for patent applications.
16 In the case of SIPO patents claims must be retrieved from the original patent documents which are only available in Chinese.

Patenting decision
We begin our discussion with the empirical results for the (binary) patenting decision. Table 4 reports results for the 4-year sample for which R&D expenditure is observed. 17 In all cases the data for the ASIE sample (ASIE-Qin/Oriana match and ASIE-only firms) is used and near the top of each table panel we indicate whether we account for selection into the integrated ASIE-Qin/Oriana sample.
Columns [1] and [2] represent simple probit models for the patenting decision with SIPO or USPTO, while in column [3] we add a sample selection equation for ASIE-Qin/Oriana firms which is estimated jointly with the two patenting decision equations (results for bivariate probit estimating selection and SIPO or USPTO patenting jointly yield qualitatively very similar results and are available on request).
The trivariate probit results suggest that our matched-sample regression does not suffer significant selection bias and that estimating patenting equations for SIPO and USPTO separately only affects estimation and inference at the margin. The remaining columns then attempt to counter concerns over endogeneity by instrumenting with the first lag and first and second lags in columns [4]-[6] (in column [6] we additionally instrument R&D expenditure using first lags). In the absence of obvious external instruments, these specifications provide some indication of the robustness of our main findings in column [3] to endogeneity concerns. Note however that diagnostic tests yield diverging results in the SIPO and USPTO models: they suggest that for the SIPO equation our instrumentation strategy violates the exclusion restriction, so that these estimates should not be interpreted as causal.
17 Appendix Table F-1 shows the linear probability results.
Conditioning on pure and non-exporters, our various SIPO models suggest a significant negative relationship between export intensity and patenting behaviour, which is in stark contrast to the findings in the existing literature. There is further a significant positive relationship with government incentives to file patents and the decision to apply for a SIPO patent. These models further provide evidence for a significant positive effect of innovation effort on the patenting decision, while firm size, foreign or private ownership, and financial constraints are also significant and have the expected signs. 18 On the whole the SIPO results indicate that the patenting decision is (partly) driven by government incentives, supporting the findings of Li (2012), Dang and Motohashi (2015) and Lei et al. (2015), and further that more export-intensive firms, contrary to a Melitz-type prediction of the exporting-productivity relationship, have a lower propensity to patent than their peers exporting lower shares of their output.
Turning to the USPTO models, many of our results are statistically insignificant, likely due to the limited number of patentees. Nevertheless we find a significant and strong relationship between the patenting decision and innovation effort, firm size, some measures of financial constraints, and export intensity. The coefficients on government incentives are uniformly low and statistically insignificant. Coefficients on export intensity are positive and large but not uniformly statistically significant across all models.
We further highlight those covariates for which there is a statistically significant difference for coefficients between the SIPO and USPTO equations: most strikingly, the export-innovation nexus is positive (though not necessarily statistically significant) and thus in line with the literature for USPTO equations, while filing subsidies are now even negative (in our IV models), albeit statistically insignificant.
Results for indicators of financial constraints show similar deviation between SIPO and USPTO patentees, though only in the IV specification with the smallest sample size (column [5]), which is also the specification where results for patent subsidies deviate statistically significantly. We obtain qualitatively similar results when including a set of 2-digit SIC industry dummies to confirm that despite the dominance of the ICT sector our results are not driven by sector of operation. 19
18 The coefficient on the cash-flow variable deviates from the existing literature on China (e.g. Guariglia and Liu, 2014) in that firms do not appear credit-constrained. Our analysis investigates patents (for SIPO: 0.39% of observations are non-zero) as opposed to (self-reported) new product sales (10.26% of observations are non-zero) in these authors' work. Hence, differences in results may be due to the fact that patented inventions commonly represent only a subset of firms' product innovations where financial constraints are potentially less relevant.
19 We prefer the results without industry fixed effects since inclusion of sectoral dummies reduces the sample size in the USPTO regression by around 25%: there are no USPTO patent filings in six sectors (Leather and fur; Furniture; Paper; Printing; Rubber; and Transport Equipment) which implies that there is no variation in the dependent variable for observations in these sectors and they are thus automatically dropped from the sample. There is further non-convergence in the trivariate probit model if we introduce industry dummies. Full results are available on request.
What are the quantitative implications of the differences detected between SIPO and USPTO patentees? Table 5 shows the marginal effects for the coefficients shown in Table 4.
For the continuous variables we focus on a hypothetical shift of a firm from the 75th to the 85th percentile of the distribution, which in the case of export intensities equates to values of 12% and 76%, respectively. The marginal effects for export intensity in the SIPO equations range between -0.1% in the probit and -0.3% in the IV probit specifications, while they are between 0.01% and 0.07% in the USPTO equations: these figures are modest in absolute terms, although we highlight the generally low propensities to patent at the top of the table. In addition, as we indicate in the columns marked 'Ratio', the export-intensity 'effect' is a multiple of the marginal effects of other firm characteristics such as firm age and size or financial constraints (note that only results for the continuous variables are directly comparable).
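The percentile-shift effect described above can be illustrated with a minimal sketch. It assumes a simple probit in which the probability of patenting is the normal CDF of a linear index. The index values, the coefficients and the liquidity percentiles below are hypothetical and are not taken from Tables 4 or 5; only the export-intensity values of 12% and 76% come from the text.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percentile_shift_effect(other_index, beta, x_from, x_to):
    """Average change in Pr(patent) when one regressor moves from x_from
    to x_to while each firm keeps the rest of its linear index."""
    changes = [
        norm_cdf(idx + beta * x_to) - norm_cdf(idx + beta * x_from)
        for idx in other_index
    ]
    return sum(changes) / len(changes)


# Hypothetical linear-index contributions of all other covariates (one per firm).
other_index = [-3.1, -2.9, -2.7, -2.5]

# Hypothetical probit coefficients (assumptions, not estimates from the paper).
beta_export = -0.2
beta_liquidity = 0.1

# Export intensity at the 75th and 85th percentiles, as quoted in the text.
export_effect = percentile_shift_effect(other_index, beta_export, 0.12, 0.76)

# Hypothetical 75th and 85th percentiles for the liquidity variable.
liquidity_effect = percentile_shift_effect(other_index, beta_liquidity, 0.30, 0.50)

# 'Ratio' column: export-intensity effect relative to another variable's effect.
ratio = export_effect / liquidity_effect

print(f"export effect:    {export_effect:.5%}")
print(f"liquidity effect: {liquidity_effect:.5%}")
print(f"ratio:            {ratio:.1f}")
```

A negative ratio indicates that the two effects have opposite signs, and its absolute value indicates how many times larger the export-intensity effect is.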

Patent count analysis
We now turn to the empirical analysis of patent production, which we investigate using count regression models. We present results from three different models with distinct setup and interpretation. First, we analyse a Negative Binomial model for counts of patent applications with SIPO and the USPTO in columns [1] and [2] of Table 6, respectively. These were found to be favoured over standard Poisson regressions based on a direct statistical comparison (LR test). These estimates provide insights into whether firm characteristics are associated with differential numbers of patent applications between the two jurisdictions. Second, we analyse fixed effects Poisson models in columns [3] and [4], which limit the sample to 'innovating' firms with at least one SIPO or USPTO patent application over the 4-year time horizon. These models ask whether changes in R&D, export behaviour, financial variables, etc. within patenting firms over time are associated with higher or lower patent counts. Since many unobserved determinants of patenting are plausibly captured by the firm fixed effects, this gets us closer to a causal interpretation of the results than the previous count data models. Note, however, that the average number of observations per firm in these FE Poisson models is merely 3.1, which offers little time series variation with which to identify within-firm effects precisely. Third, we move from counts of patent applications to those of granted patents in the analysis in columns [5] and [6]. The filings-to-grant ratio for a firm can be interpreted as a first indication of the quality of its patent filings. We find that only around 63% of SIPO filings are eventually granted whereas 83% of USPTO filings are, which motivates the analysis in columns [5] and [6].
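The LR comparison between the Poisson and Negative Binomial models can be sketched as follows. The log-likelihood values are hypothetical placeholders, not figures from the paper. Halving the p-value, because the Poisson model sits on the boundary of the Negative Binomial parameter space, is a common convention that is assumed here; the text does not state which variant was used.

```python
import math


def lr_test_poisson_vs_negbin(loglik_poisson, loglik_negbin):
    """Likelihood-ratio test of Poisson (restricted) against Negative
    Binomial (unrestricted, one extra dispersion parameter).

    Returns the LR statistic and a boundary-corrected p-value.
    """
    lr = max(2.0 * (loglik_negbin - loglik_poisson), 0.0)
    if lr == 0.0:
        return lr, 1.0
    # Survival function of a chi-squared variable with 1 degree of freedom:
    # P(X > lr) = erfc(sqrt(lr / 2)).
    p_chi2_1 = math.erfc(math.sqrt(lr / 2.0))
    # Assumed boundary correction: 50:50 mixture of chi2(0) and chi2(1).
    return lr, 0.5 * p_chi2_1


# Hypothetical log-likelihoods, for illustration only.
lr_stat, p_value = lr_test_poisson_vs_negbin(-1000.0, -990.0)
print(f"LR statistic = {lr_stat:.1f}, p-value = {p_value:.2e}")
```

A small p-value rejects the Poisson restriction of no overdispersion, which is the sense in which the Negative Binomial is 'favoured'.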
The patent count models in columns [1] and [2] show similar patterns in terms of sign and statistical significance between SIPO and USPTO patent counts as were detected in the binary choice models of the patenting decision. Innovation effort is positively associated with higher patent counts in line with earlier findings by Hu  The reported coefficients, ceteris paribus, are differences in the logs of predicted counts for unit increases in the regressors. We also obtained incidence rate ratios (IRR), which compute the relative increase or decrease (coefficients in excess of/below 1, respectively) in patent counts in response to a unit change in the regressor (reported in Table 7); for size and age this unit change implies a doubling of the variable due to logarithmic transformation. In the models in columns [1] and [2] the relative IRR for export intensity yields a twelve-fold difference between SIPO (patent count reduced to 40%) and USPTO (patent count more than quadruples), 22 that for firm size an almost three-fold difference (SIPO count doubles, USPTO count quintuples). For firm age a log unit increase sees SIPO patent count drop to 85% of the previous level, and USPTO counts to 55%, a one-and-a-half-fold difference. Similar figures are obtained if we carry out this exercise for the models using granted patents in columns [5] and [6]. 23

21 Note that the interpretation of the firm ownership dummies is very different in these panel FE models: these estimates now indicate the impact of a change in ownership, and with the results driven by a small number of observations we do not report these estimates to avoid confusion.

22 A 'unit increase' for a variable defined as a ratio between 0 and 1 is clearly difficult to interpret. For convenient interpretation we re-estimated this model using the logarithm of export intensity instead of the level, where a unit increase implies a doubling of the ratio. The IRR for SIPO applications is then 0.95, that for USPTO 1.64, with a (statistically significant) 1.7-fold difference between the two. The IRRs for size and age are virtually unchanged.
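As a worked illustration of how these quantities relate, the sketch below converts between count-model coefficients and incidence rate ratios and computes the relative IRR between jurisdictions. It uses only the two IRRs quoted in footnote 22 (0.95 for SIPO, 1.64 for USPTO); the function names are illustrative.

```python
import math


def irr_from_coef(coef):
    """An IRR is the exponentiated coefficient of a log-linear count model."""
    return math.exp(coef)


def coef_from_irr(irr):
    """Inverse mapping: the coefficient is the log of the IRR."""
    return math.log(irr)


def relative_irr(irr_uspto, irr_sipo):
    """'Ratio' column: how many times larger the USPTO IRR is than the SIPO IRR."""
    return irr_uspto / irr_sipo


# IRRs quoted in footnote 22 for the log of export intensity.
irr_sipo = 0.95
irr_uspto = 1.64

fold_difference = relative_irr(irr_uspto, irr_sipo)
print(f"relative IRR: {fold_difference:.2f}")

# An IRR below 1 corresponds to a negative coefficient, above 1 to a positive one.
print(f"implied SIPO coefficient:  {coef_from_irr(irr_sipo):+.3f}")
print(f"implied USPTO coefficient: {coef_from_irr(irr_uspto):+.3f}")
```

The relative IRR rounds to the 1.7-fold difference reported in footnote 22.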

Conclusion
What is behind the recent Chinese patent explosion? Is China transitioning rapidly from imitating technology to producing genuine innovation? What impact does the patent explosion have on the Chinese economy and on the rest of the world? While answers to these questions are of immediate concern to policy makers in China and beyond, their empirical investigation has to date been severely hampered by data limitations: there were no data available for Chinese firms that included companies' actual patent filings or that could distinguish between invention patents and the less innovative utility and design patents. We overcome these constraints and construct a dataset that contains domestic invention and U.S. utility patent filings by 316,000 manufacturing firms registered in China. We employ the data to chart the developments from 1985-2006 and to investigate the factors associated with the Chinese patent explosion over 2001-6, accounting for concerns over selection into our regression sample from survey data representative of large and medium-sized enterprises in China.
Our answer to what lies behind the Chinese patent explosion is unambiguous: a handful of companies account for the overwhelming share of patents. Does this imply there is evidence for wider technological take-off among Chinese companies? Our analysis suggests most likely not: patenting is concentrated in very few industries and even within these is undertaken by very few albeit highly active companies.
What is more, the most patent-active companies both with the USPTO and SIPO operate in the ICT sector, an industry that has become notorious for its patent battles, technological standards (including standard-essential patents), and patent pools requiring firms to arm themselves with sufficiently large patent portfolios.
Our results also point to clear differences in the determinants for the patenting decision as well as patent counts between SIPO and USPTO patentees. While the latter are positively associated with export intensity as suggested by the existing literature on export behavior and innovation, we find SIPO filings to be negatively associated. This suggests that patenting with the Chinese patent office may be to a large extent driven by factors other than underlying innovative behavior: firms patenting with SIPO are found to be responding to state incentives in the form of patent subsidies. This underscores the importance of incentives put in place by local governments to promote patenting directly.

23 All magnitudes quoted are identical, with the exception of export intensity, where the difference is now seven-fold: the SIPO count reduces to 60%, the USPTO count increases by a factor of 4.5.

Notes: We carry out separate two-sample t-tests in order to compare various firm-level and regional characteristics for [1] non-patenting vs patenting firms, and [2] firms patenting with SIPO vs those patenting with USPTO. The p-value indicates the probability value for a one-sided test. We test each relationship assuming equal or unequal variances across samples (though the means reported are for the former only), hence we obtain two sets of t-statistics and corresponding p-values: the test statistics in the first (second) row for each variable assume equal (unequal) variances. For illustration, t-statistics in bold indicate statistical significance at the 5% level.

Notes: The dependent variable in all models is a dummy equal to one for a firm patenting with SIPO (USPTO) in year t and zero otherwise. All models employ the 4-year dataset with observed R&D expenditure, with the top and bottom 1% of observations winsorized for R&D expenditure, firm size, export/sales ratio, firm age, as well as the three financial variables. Model [3] accounts for sample selection by way of modelling inclusion in the integrated ASIE-Qin/Oriana sample jointly with the patenting decision (using multivariate probit). Models [4] to [6] all use lagged values (either 1st lag or 1st and 2nd lag as indicated in the column header) of the three export variables, firm size, the three financial constraints variables and the patent subsidy variable as instruments. Model [6] in addition uses the first lag of the R&D expenditure variable as instrument (similarly for the squared term). All variables and 'Further Controls' are detailed in Table A-4 in the online appendix. Statistically significant coefficients and their standard errors appear in bold. We further highlight the covariates for which there is a statistically significant difference between the coefficients in the SIPO and USPTO equations. Standard errors are clustered at the firm level.

Continuous variables
Export/Sales -.093% .011% -.093% .013% .002% -.333% .014% -.256% .066% -.286% .014% ln(Workers) .045% -2.

Notes: We present the partial effects and related statistics for the bivariate models in Table 4. Note that for the continuous variables these are neither average partial effects for the mean nor partial effects at the average: in each case we report the average partial effect on the propensity to patent of moving from the 75th percentile to the 85th percentile of the distribution of the variable in question. We adopt this strategy due to the nature of the export-sales ratio variable, which is zero for the median firm. In the lower panel of the table we present the effect of a discrete change for the binary variables indicated from the base level (0). Next to the partial effects we report the ratio of the partial effect of export intensity (export-sales ratio) relative to that of each other variable; for instance, a value of -23.2 for 'Liquidity' in column [1] indicates that the partial effects of 'Liquidity' and 'Export/Sales' have opposite signs and that in terms of magnitudes the latter is 23.2 times as large as the former. We provide these ratios for the dummy variable discrete marginal changes as well, even though these are strictly not comparable.

Table A-4 in the online appendix. IRR reports the incidence rate ratios; see text for details. Statistically significant coefficients (10% level) and standard errors are printed in bold. In Models [3] and [4] we omit reporting coefficients for the ownership dummies since these now indicate the patent productivity of firms switching ownership, which is misleading in the general setup of our analysis. Standard errors are clustered at the firm level.

Notes: In this table we report the incidence rate ratios (IRRs) for the count data models in Table 6. These represent the relative increase or decrease (coefficients in excess of/below 1, respectively) in patent counts in response to a unit change in the regressor; for size and age this unit change implies a doubling of the variable due to logarithmic transformation. The columns marked 'Ratio' report the relative IRR between USPTO and SIPO equations: for instance, export intensity yields a twelve-fold difference in the IRR between USPTO (patent count quadruples) and SIPO (patent count reduced to 30%). Statistical tests indicate that the IRRs between SIPO and USPTO differ for the export intensity and firm size variables in both negative binomial models of patent applications and patent grants.

24 We apply a definition that assigns patents to the same equivalent group if they share the same priority documents.

A Data Construction and Descriptive Statistics

C Data Cleaning
The merged Qin/Oriana-ASIE sample contains 1,307,118 firm-year observations from 358,032 individual firms spanning the period 1999-2006 (see Table C Table A-4 and discussed in detail in Section 3 in the main text. Note that for our descriptive analysis of patenting in Section 4, we make use of the entire period for which we have patent data, which runs from 1985 to 2006.
We exploit these differences in cost as well as novelty threshold between the USPTO and SIPO to infer the type, degree of innovativeness, and potential value of the inventions created and patented by Chinese companies. During our sample period, patent filings by Chinese entities with the USPTO had to clear a higher novelty threshold than with SIPO, and given the higher associated costs we expect to see only the most valuable inventions, both from a technological and a strategic management point of view, patented with the USPTO. Hence, we can learn about the type and quality of SIPO patents by comparing them with the USPTO patents held by firms registered in China. Our integrated dataset allows us to look not only at the characteristics of the inventions underlying USPTO and SIPO patents, but also at the characteristics of the firms that hold these patents. This enables us to look at patent distributions not only across industries, but also within industries across firms.
In China, a patent application costs CNY 900 (at the time around US$ 110); there is an additional examination fee of CNY 2,500 (US$ 300) and maintenance fees of CNY 300 (US$ 35) every five years.
At the USPTO the basic application fee is US$ 330 and examination fees amount to US$ 220. At the USPTO, renewal fees are not payable annually: at 3.5 years, the maintenance fees due amount to US$ 980, at 7.5 years to US$ 2,480 and at 11.5 years to US$ 4,110. Additional costs for Chinese firms arise from the need to translate the application into English. If a Chinese applicant employs the services of a U.S. patent attorney, although not formally required by the USPTO, substantial additional costs arise. Hence, the numbers suggest that obtaining and maintaining patent protection in the U.S. is considerably more expensive than in China.
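The fee comparison can be made concrete with a short calculation using only the official fees quoted above, in US dollars. Translation and attorney costs are excluded because the text gives no figures for them. The number of SIPO maintenance payments over a patent's life is not stated in the text, so the value of three used below is an assumption.

```python
# Official USPTO fees quoted in the text (US$).
USPTO_FEES_USD = {
    "application": 330,
    "examination": 220,
    "maintenance_3.5_years": 980,
    "maintenance_7.5_years": 2480,
    "maintenance_11.5_years": 4110,
}

# SIPO fees quoted in the text, using the approximate US$ equivalents given there.
SIPO_FEES_USD = {"application": 110, "examination": 300}
SIPO_MAINTENANCE_USD = 35  # per payment, due every five years according to the text


def sipo_total_usd(n_maintenance_payments):
    """Total official SIPO fees for a given number of maintenance payments."""
    return sum(SIPO_FEES_USD.values()) + n_maintenance_payments * SIPO_MAINTENANCE_USD


uspto_total = sum(USPTO_FEES_USD.values())

# Assumption: three maintenance payments over the life of the patent.
sipo_total = sipo_total_usd(3)

print(f"USPTO total: US$ {uspto_total}")
print(f"SIPO total:  US$ {sipo_total}")
print(f"USPTO/SIPO cost ratio: {uspto_total / sipo_total:.1f}")
```

Under these assumptions the official fees alone differ by more than an order of magnitude, before any translation or attorney costs.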

E State incentives for patenting
We adopt the data for patent subsidy programs reported in Dang and Motohashi (2015), 29 which were collected by these authors from official government documents, news reports and telephone interviews with local officials. Dang and Motohashi (2015) devise a points system whereby a subsidy related to the 'filing' of (application for) a patent carries a value (i) equal to 1 if the fee is fully subsidised; (ii) equal to 0.5 if there is a partial subsidy; and (iii) equal to 0 if there is no subsidy. The overview of the provincial incentive scheme is provided in Table E-1, while Table E subsidy schemes: in the right panel we compute the share of points in total potential points across all provinces.
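The points system and the share of potential points can be sketched as follows. The province names and their subsidy statuses are hypothetical placeholders, not data from Dang and Motohashi (2015).

```python
# Points assigned to each subsidy status under the scheme described in the text.
POINTS = {"full": 1.0, "partial": 0.5, "none": 0.0}


def total_points(statuses):
    """Sum of subsidy points across provinces in a given year."""
    return sum(POINTS[status] for status in statuses)


def share_of_potential_points(statuses):
    """Points obtained as a share of the maximum (one point per province)."""
    return total_points(statuses) / len(statuses)


# Hypothetical filing-subsidy statuses for four provinces in one year.
filing_subsidy = {
    "Province A": "full",
    "Province B": "partial",
    "Province C": "none",
    "Province D": "full",
}

points = total_points(filing_subsidy.values())
share = share_of_potential_points(list(filing_subsidy.values()))
print(f"points: {points}, share of potential points: {share:.1%}")
```

The share differs from the share of provinces with any scheme because a partial subsidy contributes only half a point.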
(2015): filing subsidies are equal to 1 if the filing or examination fee is fully subsidized in the province where the applicant is located in year t, 0.5 if partly, 0 if not. In the left panel we tally up these points; in the right column we indicate the share of full subsidies implemented across provinces (i.e. not merely the share of provinces which adopted any subsidy scheme). We highlight the four sample years for our regression analysis in bold.

Notes: The table presents results from standard and IV linear probability models (LPM). The specifications in [1] and [2] correspond to those in the same columns in Table 4 in the main text, while those in [3]-[5] above correspond to columns [4]-[6]. Statistically significant coefficients and their standard errors appear in bold. We further highlight the covariates for which there is a statistically significant difference between the coefficients in the SIPO and USPTO equations. Standard errors are clustered at the firm level.