Who buys food products from online influencers? Predictions with machine learning

in International Food and Agribusiness Management Review
Authors:
Xiaoping Zhong Professor, China Cooperative Economics Research Institute, Anhui Agricultural University 230036 Hefei P.R. China

and
Xiaohua Yu Professor, Department of Agricultural Economics and Rural Development, University of Göttingen MZG 10.120, Platz der Göttinger Sieben 5, 37073 Göttingen Germany


Abstract

The burgeoning growth of rural e-commerce in China is poised to fundamentally transform the rural economic and social landscape. Within this evolving ecosystem, agricultural online influencers have emerged as pivotal actors, yet their specific roles and underlying mechanisms warrant further investigation. Based on online survey data, this study used four machine learning techniques to predict whether consumers purchase agricultural products recommended by online influencers and to identify key predictors. We found that the random forest algorithm, incorporating a comprehensive set of economic and social predictors, achieves the best prediction accuracy, at 85.12%. The random forest algorithm and LASSO regression together identified the key predictors. Notably, purchase intention and whether the consumer follows the online influencer on social media emerged as the features that contribute most to prediction accuracy. Furthermore, our analysis highlights the importance of demographic factors (age), perceived value assessments (emotional and social value), engagement levels (attention to online influencers), and prior e-commerce experience as crucial mechanisms. Our findings are of practical value for online influencers and related enterprises seeking to improve marketing strategies and promote branding and commercialization, as well as for the development of rural e-commerce in China and the cultivation of related professionals.

1. Introduction

The advancement of information and communication technology (ICT), coupled with the wide adoption of the Internet and the proliferation of social platforms, has catalyzed the exponential growth of e-commerce. This digital transformation has fundamentally reshaped global economic and social relations and structures, with substantial evidence suggesting significant contributions to economic growth, inclusive development and public welfare (Karine, 2021; Liu et al., 2021; Zhou et al., 2021). The transformative effects of e-commerce are particularly pronounced in rural areas (Kshetri, 2018; Zhang et al., 2022), as rural communities around the world face widespread problems of decay (Liu and Li, 2017). Rural e-commerce increases rural households’ income, narrows the rural-urban gap, contributes to the adjustment of industrial structure, and changes the form of the rural economy (Couture et al., 2021; Zhu et al., 2016). It also provides more employment and entrepreneurship opportunities for farmers and encourages migrant laborers to return to the countryside, which affects the path of urbanization (Qi et al., 2019; Yu and Cui, 2019). Of particular significance is the pivotal role of e-commerce development in poverty alleviation, achieved through transaction cost reduction and the facilitation of direct producer-consumer linkages, thereby enabling market access and trade opportunities for smallholder farmers and geographically isolated communities (Atasoy, 2013; Cui et al., 2017; Peng et al., 2021; Zapata et al., 2013).

The benefits of rural e-commerce development have been demonstrated in major developing economies, notably China and India (see e.g. Li et al., 2019; Lele and Goswami, 2017). Developing economies such as Brazil and South Africa and international development organizations such as the World Bank are keeping a close eye on the development of rural e-commerce (Karine, 2021; World Bank Group, 2016). The emergence and development of rural e-commerce provides developing countries with a strategic tool for addressing complex socioeconomic challenges and a promising means to improve living standards among low-income populations.

Since the 2010s, the Chinese government has implemented a comprehensive policy framework to foster rural e-commerce development, encompassing standardization, brand development, cultivation of specialized industries, and agricultural product marketing. This strategic initiative has yielded substantial growth, with China’s online retail market expanding from 0.78 trillion CNY ($120.7 billion) in 2011 to 13.79 trillion CNY ($2,050.2 billion) in 2020, while rural online retail sales surged from 0.35 trillion CNY ($57 billion) in 2015 to 2.17 trillion CNY ($322.6 billion). The development of rural e-commerce has significantly enhanced rural households’ income, generated employment and entrepreneurship opportunities, and transformed the socioeconomic landscape of rural China (Li et al., 2021; Qi et al., 2019; Tang and Zhu, 2020; Zhang et al., 2022).

The flourishing rural e-commerce sector is sustained by a range of actors, including digital platforms, individual participants, governments, MCNs, suppliers and consumers (Li et al., 2019; Yu and Cui, 2019). These stakeholders constitute an ecosystem through interaction, cooperation and shared interests. However, this ecosystem inevitably faces many challenges, particularly in human resource development, as evidenced by the scarcity of professionals and the consequent limitations in resource coordination and integration (Ahmad et al., 2015; Cui et al., 2017; Li et al., 2021; Malecki, 2003). Within this context, online influencers (Wanghong in Chinese, literally people who become popular on the Internet), especially those in the agricultural domain, have emerged as pivotal components of the rural e-commerce ecosystem. They have a strong ability to transform online traffic into cash (Abidin, 2018), thereby exerting substantial influence on agricultural product sales through online platforms (Zhong et al., 2023). Despite their growing importance, the literature remains limited in examining the specific roles and mechanisms of online influencers within the rural e-commerce ecosystem, leaving a critical gap in our understanding of their impact on ecosystem development and sustainability.

Online influencers, a burgeoning professional cohort that has emerged alongside the rapid advancement of ICT in China, have promoted the booming Wanghong economy (Craig et al., 2021; Sandel and Wang, 2022). Despite prevailing societal stereotypes that often associate online influencers with vulgarity and moral deficiency (Zhang and de Seta, 2018), some studies have focused on the ability of some online influencers to become online opinion leaders in specific domains through effective self-branding strategies (Wang and Feng, 2022). These online influencers are increasingly recognized as moral exemplars capable of exerting great influence on public feelings, attitudes and opinions, thereby influencing and changing consumers’ purchase intentions and behaviors (Zhong et al., 2023). Some studies have explored the influence of online influencers on consumption intentions and marketing, identifying key factors such as personal attributes, informational value, and emotional resonance (Abidin, 2015; Chang and Woo, 2019). These discussions, however, are relatively fragmented, and the main factors through which online influencers change consumption intentions and behaviors have not yet been identified or used for prediction in any depth.

To address the identified research gaps, we formulate two pivotal research questions: First, how do online influencers affect the prosperity of the rural e-commerce ecosystem? Second, what specific pathways do they utilize to influence the development of rural e-commerce? Closely related to these two research questions is the policy issue of how to more effectively train the rural e-commerce professionals and enhance their capacity, to improve farmers’ livelihoods and rural development.

We exploited online survey data with a total sample size of 1051, employed several machine learning techniques to predict consumers’ purchasing behavior toward agricultural products recommended by online influencers, and identified important influencing factors. Machine learning techniques are particularly advantageous for predictive analysis and generally recognized to perform well. Among them, random forest is popular for its good predictions and replicability (Browne et al., 2021; Htet et al., 2021; Maruejols et al., 2022, 2025). Nevertheless, we use several other machine learning algorithms in addition to random forest, specifically, gradient boosting, support vector machines (SVM), and a regularized regression approach, the LASSO regression, to compare the prediction accuracy and mechanisms with random forest by predicting purchasing behaviors using a series of socio-economic predictors.

We find that the random forest algorithm incorporating socio-economic predictors achieves an accuracy of 85.12% in predicting consumer purchasing behavior. Willingness to buy the products recommended by the agricultural online influencer contributes most to the prediction of purchasing behavior, although about 23% of consumers exhibit discrepancies between their stated intentions and actual purchasing behavior. High quality, good taste, and guaranteed food safety are the top reasons for purchase intention. The second most important contributor is whether the consumer follows the online influencer on social media: followers and admirers are found to evaluate the online influencer differently and thus to behave differently towards the online influencer. The feelings and social impacts brought by online influencers, as well as the content they generate, are strongly associated with fans’ purchasing behavior.

The remainder of this study is organized as follows. Section 2 provides contextual information about rural e-commerce development in China, online influencers, and the prediction work. Sections 3 and 4 elaborate on the machine learning techniques and the data, respectively. Section 5 presents our results and discussion, and Section 6 concludes.

2. Literature review

2.1 Rural e-commerce development in China

Since 2010, the Chinese government has put forward the ‘Internet Plus’ initiative, and attached great importance to the integration of Internet and economic development. The Internet is regarded as an engine to achieve high-quality economic development, and e-commerce is considered to be an important tool to drive consumption and promote the service industry transformation (e.g. Kshetri, 2018). The ‘Internet Plus’ initiative promotes the expansion of e-commerce from cities to rural areas, and rural e-commerce is expected to invigorate urban and rural markets, promote the digital transformation of the entire agricultural industry chain, and reduce poverty.

The central government has announced a series of strong supporting policies to give priority to the development of rural e-commerce. In 2014, Poverty Alleviation e-commerce (PAeC) was officially incorporated into China’s mainstream poverty alleviation policy system and work system for the first time. In 2016, for the first time, the Internet was written into the government work report as an important part of the ‘new economy’. The No. 1 Central Document of CPC in 2017 and 2018 focused on the task of promoting the development of rural e-commerce. In 2019, the No. 1 Central document proposed the implementation of the digital rural strategy, in-depth promotion of ‘Internet plus agriculture’, and again emphasized the development of rural e-commerce. In addition, the central government has successively issued a series of policy documents to promote the development of rural e-commerce, such as the Outline of the Digital Villages Development Strategy and the Digital Agriculture and Villages Development Plan (2019–2025).

With the strong support of government policies, rural e-commerce in China is booming (Li et al., 2019). The development of Taobao villages and Taobao towns is probably the most prominent example of rural e-commerce development in China (Zhou et al., 2021). From the first three Taobao villages discovered in 2009 to 5425 in 28 provinces across the country in 2020, the number and scale of Taobao villages have taken a qualitative leap. According to Alibaba Group, online sales in Taobao villages and Taobao towns exceeded 1 trillion CNY ($145 billion) in 2020, with 2.96 million active online stores and 8.28 million job opportunities created. The active rural online retail market has continuously spawned rural industrial clusters, shaped rural brands, and provided a steady stream of power for rural development and the improvement of rural households’ income (Cui et al., 2019; Peng et al., 2021; Zhang et al., 2022; Zhou et al., 2021). It is worth noting that the development of Taobao villages is characterized not only by significant spatial agglomeration but also by an unbalanced spatial distribution (Liu et al., 2020). Rural e-commerce is mainly concentrated in the economically developed eastern regions of China, and the five provinces with the most Taobao villages, Zhejiang, Guangdong, Jiangsu, Shandong and Hebei, account for 84% of the country’s total. Some small towns and counties in less-developed western and northeastern China have started to develop Taobao villages. Overall, although its development is unbalanced, rural e-commerce in China is undeniably an effective tool to solve the problems of rural economic and social development (e.g. Liu et al., 2021).

2.2 Online influencers and impacts on consumers’ behavior

With the boom of social media, some users develop extensive social networks that attract wide attention and visibility, and become what are referred to as online influencers. Online influencers have different labels on different social platforms as well as in different countries, such as social media influencers (SMIs), micro-celebrities, ‘Instafamous’, and Wanghong (Djafarova and Trofimenko, 2019; Freberg et al., 2011; Khamis et al., 2017; Sandel and Wang, 2022; Zhang and de Seta, 2018). Despite these different labels, online influencers share some common characteristics: a certain number of followers, highly interactive profiles, and promising commercial value (Li, 2018; Zhong et al., 2023). They acquire these characteristics by creating their own unique online images, frequently interacting with other users to boost their online profiles, and/or providing value (information and/or advice) to their followers (Khamis et al., 2017; Li, 2018). Online influencers, therefore, are often considered trustworthy and more accessible (Djafarova and Rushworth, 2017, 2019), and thus possess persuasive power (Freberg et al., 2011) to influence the attitudes and behavior of their followers. In addition, online influencers also have an impact on the business sector by reducing search costs as well as verification costs (e.g. Goldfarb and Tucker, 2019) through the Internet and social platforms. This influence constitutes a monetizable asset, with rational actors strategically converting online traffic into revenue streams whenever possible (Klein, 2013; Abidin, 2018). The incentive structure is amplified in the Chinese context, because the premise of becoming an online influencer (Wanghong) is to have a keen ability to convert online traffic into cash, which depends more on the ability to hold fans’ attention visually than on content production (Abidin, 2018). However, some online influencers have also been condemned for creating and disseminating vulgar or even illegal content in order to attract attention and gain profits (Zhang and de Seta, 2018).

Naturally, online influencers can and need to parlay their online traffic and influence into commercial arrangements. Followers, who are also consumers, generally consider online influencers to be more authentic and attractive, leading to perceptions of credibility and likeability (Abidin, 2015; De Veirman et al., 2017; Jin et al., 2019). This contributes to reducing search costs and verification costs (Goldfarb and Tucker, 2019) and enhances the persuasive power to shape the attitudes and guide the behavior of consumers (Zhong et al., 2023). Some recent studies have explored the impact of online influencers on consumer attitudes and behavior (Casaló et al., 2020; De Veirman et al., 2017; Djafarova and Rushworth, 2017, 2019; Park and Lin, 2020; Zhong et al., 2023). Park and Lin (2020) found that a match-up between the product and the online influencer promotes purchasing attitudes due to the trustworthiness of the online influencer. Djafarova and Rushworth (2017) showed that the ‘Instafamous’ influence young women’s purchasing behavior. Zhong et al. (2023) noted that an online influencer in agriculture, who is also perceived as a general online opinion leader due to her personal traits and popularity, promotes sales of agricultural and sideline products and agribusiness. In general, online influencers can promote consumption because of characteristics that followers find appealing, such as authenticity, credibility and likeability.

2.3 Determinants of consumers’ online shopping behavior

The development of e-commerce has reshaped the global retail paradigm. More and more consumers have become active online shoppers out of convenience, diversity, hedonism, economic and social orientation (Zhou et al., 2007). Identifying the influencing factors of consumers’ online shopping has attracted extensive attention from researchers and retailers (Clemes et al., 2014; Beckers et al., 2018; Bucko et al., 2018; Srivastava and Thaichon, 2023). Some important factors have been identified in the literature, mainly including consumer characteristics (such as age, gender and other demographic characteristics), product characteristics (such as price), merchant and intermediary characteristics (such as brand, service, etc.), environmental impact (such as market uncertainty), and media characteristics (such as information quality) (see details in Gong et al., 2013). Zhou et al. (2007) summarized the consumer characteristics related to online shopping. Demographics, internet experience, normative beliefs, shopping orientation, shopping motivation, personal traits (innovativeness), online experience, psychological perception, and online shopping experience are closely related. In the Chinese context, Sin and Tse (2002) suggested that online and offline shoppers can be distinguished by demographic, psychological, attitudinal, and empirical characteristics. Gong et al. (2013) found that consumers’ age, income, education level, marital status and perceived usefulness are important predictors of online shopping intention. Clemes et al. (2014) identified and ranked seven decision factors that determine Chinese consumers’ online shopping: perceived risk, consumer resources, service quality, subjective norms, product types, convenience and website factors.

In the context of China’s huge Wanghong economy (e.g. Craig et al., 2021), in addition to the aforementioned factors, some researchers have found that online influencers brand themselves by creating their own unique online image, for example by constructing a distinctive personal background story, invoking this story for sales promotion, and using the visibility of the platform to enhance interaction with their followers, thereby establishing a broader brand concept (Khamis et al., 2017; Sandel and Wang, 2022; Wang and Feng, 2022). In line with source credibility theory and uses and gratifications theory (Katz and Foulkes, 1962; Umeogu, 2012), this strategy can enhance followers’ trust in online influencers and satisfy certain needs of followers, such as information, advice, and a preference for humor and moral quality, so as to guide followers’ consumption attitudes and behaviors (e.g. Chang and Woo, 2019). Besides, online influencers also have a strong influence on their followers because of the cross-cultural, emotional, and social values they provide (Zhong et al., 2023). Therefore, it is necessary to further discuss how online influencers in the agricultural field promote consumption in the context of the development of rural e-commerce as well as the Wanghong economy in China.

3. Methodology

Machine learning techniques are well suited to predictive tasks and generally perform well. Among them, the random forest algorithm is popular because of its good prediction performance, ease of use and reproducibility (e.g. Browne et al., 2021). Although the random forest algorithm cannot clearly establish causal relationships between variables, it can identify important factors while maintaining high prediction accuracy even in small samples (Wang et al., 2021), which can provide useful information for policy making. Some studies have employed the random forest algorithm to assess poverty and malnutrition in low- and lower-middle-income countries, and to predict energy poverty in India and subjective poverty of farmers in China (Browne et al., 2021; Maruejols et al., 2022, 2025; Wang et al., 2021). Following Maruejols et al. (2022), in addition to the random forest algorithm, we also explore alternative classification and prediction tools, namely gradient boosting classification, support vector machines, and LASSO regression, and compare their prediction accuracy and the mechanisms they identify with those of random forest.

3.1 Random forest

The random forest algorithm builds on the decision tree model, a method that has received growing research attention for its interpretability and its ability to capture nonlinear relationships. A decision tree adopts a “divide and conquer” strategy: it divides the space spanned by the features into several regions or groups and predicts each observation from the training observations that fall in the same region. At each split, the decision tree model considers the influence of each feature x on the outcome y and selects a splitting variable to divide the data into groups. The split is chosen so that the resulting groups are as pure as possible, that is, so that the impurity within groups is reduced as much as possible. If splitting continues indefinitely in order to improve classification accuracy on the training data, however, overfitting will inevitably arise and the predictive power of the model will be reduced.

The random forest algorithm is an ensemble learning method based on decision trees that overcomes the shortcomings of a single decision tree and better balances bias and variance. Through random feature selection ($m < p$, where $p$ is the number of features) and bootstrap sampling, several decision trees are grown and combined into a random forest. Since each tree is formed using only part of the features, the correlation among trees is reduced, which reduces the variance and hence the total error. The choice of the parameter $m$ is therefore crucial, as it involves a trade-off between bias and variance. The parameter can be determined using the out-of-bag (OOB) error. In each bootstrap sample, about 1/3 of the observations are not used to grow the decision tree and instead form the out-of-bag sample. For classification, each observation is predicted by a majority vote over the trees for which it is out-of-bag. Comparing these predictions with the observed values gives the OOB error. The optimal number of trees is the one beyond which the OOB error no longer decreases as more trees are added. Finally, the importance of each feature, and hence the ranking of predictors, can be measured by its contribution to the reduction of impurity, averaged over all trees in the forest.
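The claim that roughly one third of the observations end up out-of-bag can be checked with a short simulation. This is only an illustrative sketch: the helper name `bootstrap_indices`, the seed and the number of repetitions are assumptions made here, and the sample size 1051 is simply the figure reported in the Introduction. The exact expected share is $(1 - 1/n)^n$, which is close to 0.368 for large $n$.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    out_of_bag = sorted(set(range(n)) - set(in_bag))
    return in_bag, out_of_bag

rng = random.Random(0)   # fixed seed, chosen arbitrarily for reproducibility
n = 1051                 # sample size reported in the paper
fractions = []
for _ in range(200):     # 200 bootstrap draws (arbitrary choice)
    _, oob = bootstrap_indices(n, rng)
    fractions.append(len(oob) / n)

# Average share of observations left out of a bootstrap sample.
mean_oob_fraction = sum(fractions) / len(fractions)
print(round(mean_oob_fraction, 3))
```

Because the out-of-bag observations were not used to grow the corresponding tree, they can serve as a built-in validation set without a separate hold-out split.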

Specifically, the random forest algorithm can be succinctly defined as follows:

We employ bootstrap aggregating, or bagging, to draw $K$ random samples from the training sample, thereby forming a forest composed of $K$ trees. Here, the training sample is denoted by $S = \{(x_i, y_i)\}_{i=1}^{n}$, where $x$ represents the features of the training sample and $y$ denotes the output. For $k = 1$ to $K$, a bootstrap sample $S_k$ is drawn from the training sample, and a classification tree $T_k$ is grown on this bootstrap sample. After $K$ iterations, we obtain an ensemble of $K$ trees $\{T_k\}_{k=1}^{K}$. Finally, letting $\hat{C}_k(x)$ denote the class prediction of the $k$-th tree, the final prediction is obtained by majority voting:

$$\hat{C}_{\mathrm{rf}}(x) = \text{majority vote}\,\{\hat{C}_k(x)\}_{k=1}^{K}$$

Citation: International Food and Agribusiness Management Review 28, 2 (2025) ; 10.22434/ifamr.1130
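The bagging and majority-voting steps can be sketched in a few lines. The example below is a simplified illustration, not the authors' implementation: it uses one-split trees (stumps) on a single made-up feature, so the random feature selection step of a full random forest is omitted, and all function names and data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-split tree on one feature by minimising misclassifications."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def fit_forest(xs, ys, n_trees, rng):
    """Grow one stump per bootstrap sample (bagging)."""
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample S_k
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict_majority(trees, x):
    """Final prediction: the class predicted by most trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

rng = random.Random(1)
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]   # toy feature values
ys = [0, 0, 0, 0, 1, 1, 1, 1]                   # toy binary outcome
forest = fit_forest(xs, ys, n_trees=25, rng=rng)
print(predict_majority(forest, 0.15), predict_majority(forest, 0.85))
```

An odd number of trees is used so that a binary vote cannot end in a tie.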

3.2 Gradient boosting classification

The ensemble strategy of the random forest is to decorrelate the decision trees; another ensemble strategy, boosting, instead makes the decision trees complement one another. The earliest method is Adaptive Boosting (AdaBoost for short), proposed by Freund and Schapire (1996, 1997). For classification, the basic idea of AdaBoost is to grow $M$ trees, $G_1(x), \ldots, G_M(x)$, in turn. The observations misclassified by the $m$-th tree receive larger weights in the $(m+1)$-th tree, through reweighting or resampling, and so on. Because each decision tree plays a different role, the order of these sequentially grown trees cannot be changed at will. Since each tree in the boosting method corrects the classification errors of the previous tree, the classifier is forced to pay more attention to the misclassified regions of the feature space, so the bias can be reduced. At the same time, the final prediction of AdaBoost is a weighted average of many decision trees, so the variance can also be reduced.
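The reweighting idea can be illustrated with a minimal AdaBoost sketch using one-split trees on a toy one-dimensional data set. The function names, the data and the coding of classes as +1 and -1 are assumptions made for this illustration; the tree weight $0.5\ln((1-\text{err})/\text{err})$ is the standard choice for this label coding.

```python
import math

def best_stump(xs, ys, w):
    """Threshold rule with the lowest weighted error: (error, threshold, sign)."""
    best = None
    for t in sorted(set(xs)):
        for s in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if (s if x <= t else -s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best

def adaboost(xs, ys, n_rounds):
    n = len(xs)
    w = [1.0 / n] * n                           # start with equal weights
    model = []
    for _ in range(n_rounds):
        err, t, s = best_stump(xs, ys, w)
        err = min(max(err, 1e-12), 1 - 1e-12)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err) # weight of this tree
        model.append((alpha, t, s))
        # Misclassified observations (y * prediction = -1) gain weight.
        w = [wi * math.exp(-alpha * y * (s if x <= t else -s))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * (s if x <= t else -s) for alpha, t, s in model)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]    # no single threshold separates these labels
model = adaboost(xs, ys, n_rounds=3)
print([predict(model, x) for x in xs])
```

No single stump can classify this pattern correctly, but the weighted combination of three sequentially grown stumps can, which is the sense in which the trees complement one another.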

Statistically, the AdaBoost algorithm for binary classification is equivalent to forward stagewise additive modeling with an exponential loss function. In AdaBoost, each term in the final expression, $G_m(x)$, is treated as a basis function, which is formally similar to a Taylor expansion. More generally, a basis function expansion of the learned function $f(x)$ yields an additive model:

$$f(x) = \sum_{m=1}^{M} \beta_m G(x; \gamma_m)$$

where $\beta_m$ is the expansion coefficient, $G(x; \gamma_m)$ is the basis function, and $\gamma_m$ is the parameter vector. For example, if a decision tree is chosen as the basis function, $\gamma_m$ comprises the split nodes, predicted values and other parameters. To estimate $\beta_m$ and $\gamma_m$, the following objective function can be minimized:

$$\min_{\{\beta_m, \gamma_m\}_{m=1}^{M}} \sum_{i=1}^{n} L\left(y_i, \sum_{m=1}^{M} \beta_m G(x_i; \gamma_m)\right)$$

where $L(y_i, f(x_i))$ is a loss function, such as the squared error $(y_i - f(x_i))^2$ or the 0–1 loss $I(y_i \neq f(x_i))$. Forward stagewise additive modeling is then used to solve the problem.

Friedman (2001) generalizes AdaBoost to the more general gradient boosting machine (GBM in short). Its innovation lies in estimating the basis function with non-parametric methods and using gradient descent for approximate solutions in the function space.

Specifically, GBM consists of three steps:

Step one: initialize $F_0(x) = \arg\min_{\rho} \sum_{i=1}^{n} L(y_i, \rho)$, so that $F_0(x)$ is the optimal constant function.

Step two: for each basis function $m = 1, \ldots, M$, perform the following loop:

First, calculate the quasi-residual:

$$r_i^{(m)} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x) = F_{m-1}(x)}, \quad i = 1, \ldots, n$$

Next, regress the quasi-residual $r_i^{(m)}$ on $x$:

$$\alpha_m = \arg\min_{\alpha, \beta} \sum_{i=1}^{n} \left[r_i^{(m)} - \beta h(x_i; \alpha)\right]^2$$

Then, calculate the optimal step size:

$$\rho_m = \arg\min_{\rho} \sum_{i=1}^{n} L\left(y_i, F_{m-1}(x_i) + \rho h(x_i; \alpha_m)\right)$$

Finally, update the function:

$$F_m(x) = F_{m-1}(x) + \rho_m h(x; \alpha_m)$$

Step three: output the result $F_M(x)$.

In practice, the cross-validation error can be used to determine the optimal number of decision trees, that is, the number that gives the lowest prediction mean squared error.
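The three GBM steps can be sketched as follows. To keep the example short it assumes the squared-error loss $L = \frac{1}{2}(y - F)^2$, for which the quasi-residual is simply $y_i - F_{m-1}(x_i)$ and the step size has a closed form; it is therefore a regression illustration rather than the classification setting used in the paper, and the function names and data are hypothetical.

```python
def fit_stump(xs, rs):
    """Least-squares regression stump: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:          # keep both sides non-empty
        left = [r for x, r in zip(xs, rs) if x <= t]
        right = [r for x, r in zip(xs, rs) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def stump_predict(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds):
    f0 = sum(ys) / len(ys)                  # step one: optimal constant
    preds = [f0] * len(ys)
    stages, losses = [], []
    for _ in range(n_rounds):               # step two
        residuals = [y - p for y, p in zip(ys, preds)]   # quasi-residuals
        stump = fit_stump(xs, residuals)                 # regress them on x
        h = [stump_predict(stump, x) for x in xs]
        denom = sum(v * v for v in h)
        # Closed-form line search for the squared-error loss.
        rho = sum(r * v for r, v in zip(residuals, h)) / denom if denom > 0 else 0.0
        preds = [p + rho * v for p, v in zip(preds, h)]  # update F_m
        stages.append((rho, stump))
        losses.append(0.5 * sum((y - p) ** 2 for y, p in zip(ys, preds)))
    return f0, stages, losses

def predict(f0, stages, x):
    """Step three: evaluate F_M at a new point."""
    return f0 + sum(rho * stump_predict(stump, x) for rho, stump in stages)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]
initial_loss = 0.5 * sum((y - sum(ys) / len(ys)) ** 2 for y in ys)
f0, stages, losses = gradient_boost(xs, ys, n_rounds=5)
print(initial_loss, losses)
```

The training loss falls with every round, which is why the number of trees is usually chosen by cross-validation error rather than by training error.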

3.3 Support vector machines

The support vector machine (SVM) is a classification method that is especially suitable for high-dimensional data with many variables. Its core idea is to find the optimal separating hyperplane between two classes of data. If a separating hyperplane exists between the two classes, the data set is linearly separable. For linearly separable data, however, the separating hyperplane is not unique: the data can still be separated after adjusting the position or angle of the hyperplane. One solution to this problem is to construct an isolation band (or margin) between the two classes, the wider the better, so that the separating hyperplane is as far away from both classes as possible. This is called the maximum margin classifier, or the widest street approach. The sample points (vectors) that determine the optimal separating hyperplane and the location of the maximum margin are called support vectors.

The maximum margin classifier is easily affected by outliers, which makes its results non-robust, and not all data are linearly separable. The constraint of the maximum margin classifier can therefore be relaxed: the separating hyperplane is required to separate the majority of observations correctly, while a small number of observations are allowed to be misclassified (or to fall within the margin). This approach is called a soft margin classifier, or support vector classifier. In this case, all sample points on the margin, within the margin, or misclassified are support vectors because they all affect the optimal solution. The soft margin classifier is less sensitive to possible outliers, so it is commonly applied to linearly separable data as well. A penalty is imposed on the allowed misclassification, and its optimal value is determined by cross-validation.
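For reference, the soft margin idea is commonly written as the optimization problem below. The symbols $w$, $b$, $\xi_i$ and $C$ are standard textbook notation rather than notation taken from this paper, and the class labels are assumed to be coded as $y_i \in \{-1, +1\}$.

```latex
\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i (w'x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, \ldots, n
```

Here the slack variable $\xi_i$ measures how far observation $i$ violates the margin, and $C$ is the penalty on such violations, typically chosen by cross-validation.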

In addition, for data that are not linearly separable, it is difficult to find a linear separating hyperplane, but there is generally a nonlinear decision boundary. For this kind of data, one can transform the feature vector $x_i$ to $\varphi(x_i)$, where $\varphi(\cdot)$ is a multi-dimensional or even infinite-dimensional function. The training sample is thus transformed into $\{(\varphi(x_i), y_i)\}_{i=1}^{n}$, in the hope that it is linearly separable in the feature space of $\varphi(x_i)$. After this transformation, the estimation of the SVM depends only on the inner product of $\varphi(x_i)$ and $\varphi(x_j)$, without having to know $\varphi(\cdot)$ itself. This inner product can be defined as a kernel function $\kappa(x_i, x_j) \equiv \langle \varphi(x_i), \varphi(x_j) \rangle = \varphi(x_i)'\varphi(x_j)$. Therefore, one only needs to specify the concrete form of the kernel function directly, without knowing $\varphi(\cdot)$ in advance. This method is known as the kernel trick. Commonly used kernel functions include the polynomial kernel, radial kernel, Laplacian kernel and sigmoid kernel. After substituting the kernel function, the optimal solution of the SVM can be expressed as an expansion in kernel functions evaluated at the training samples, known as the support vector expansion.
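A small numerical check makes the kernel trick concrete. The sketch below assumes a homogeneous polynomial kernel of degree 2 on two-dimensional inputs, chosen only because its feature map can be written out by hand; the function names and input values are illustrative.

```python
def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, 2 ** 0.5 * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, z):
    """kappa(x, z) = (x'z)^2, computed without forming phi."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # inner product in the transformed space
implicit = poly_kernel(x, z)     # same quantity via the kernel function
print(explicit, implicit)
```

Both routes give the same inner product up to floating-point rounding, but the kernel route never constructs $\varphi(x)$, which is what makes very high-dimensional or infinite-dimensional feature spaces tractable.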

3.4 LASSO regression

Last, a penalized regression technique of least absolute shrinkage and selection operator (LASSO) applies to high dimensional data as well, and is popular due to its function to select features that really have an impact on the outcome y. Tibshirani (1996) put forward LASSO method, its basic idea is to add a penalty term to the loss function and then carry out penalty regression:

$$\min_{\beta}\; L(\beta) = (y - X\beta)'(y - X\beta) + \lambda \lVert \beta \rVert_1$$

Citation: International Food and Agribusiness Management Review 28, 2 (2025) ; 10.22434/ifamr.1130

where the first term of the loss function is the residual sum of squares (SSR), the second term is the penalty term (regularization), and λ is the tuning parameter that controls the intensity of the penalty. ‖β‖1 = ∑j |βj| is the L1 norm of the parameter vector β, that is, the sum of the absolute values of its components. Since the loss function includes the penalty term λ‖β‖1, LASSO is also a shrinkage estimator, and its optimal solution βˆlasso is shrunk toward the origin relative to the OLS estimator βˆols. Further, because the sum of the absolute values of the components of β is penalized, the method is called absolute shrinkage.

Solving the LASSO minimization problem is equivalent to solving a constrained extremum problem in which the contours of the SSR easily touch the LASSO constraint set at a corner. This makes some regression coefficients of the LASSO estimator exactly equal to 0, resulting in a sparse solution. This property gives LASSO the function of variable selection and the advantage of interpretability. The optimal solution of LASSO regression is also a function of the tuning parameter λ; varying the penalty intensity λ yields the entire solution path, or coefficient path. A common method for selecting the optimal λ is cross-validation (CV): the optimal λ minimizes the cross-validation error CV(λ), that is, λˆ = argminλ CV(λ). LASSO has been applied to feature selection in economic analysis (e.g. Hoeschle et al., 2023; Li and Yu, 2025).
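
The shrinkage and exact-zero behavior can be seen in the simplest possible case. The sketch below assumes a single predictor without an intercept and the loss written as SSR plus λ|β|, for which the minimizer has a closed form through soft thresholding; the data and function names are hypothetical, and cross-validation over λ is not shown.

```python
def soft_threshold(a, t):
    """Shrink a toward zero by t, returning exactly zero when |a| <= t."""
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0

def lasso_1d(x, y, lam):
    """Minimizer of sum((y_i - x_i * b)^2) + lam * |b| for a single predictor."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return soft_threshold(sxy, lam / 2.0) / sxx

# Hypothetical data generated exactly by y = 2x, so the OLS slope is 2.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

# A tiny coefficient path: the slope shrinks as lam grows and then hits zero.
path = {lam: lasso_1d(x, y, lam) for lam in (0.0, 14.0, 56.0, 100.0)}
```

With λ = 0 the estimate equals the OLS slope; for large enough λ it is exactly zero rather than merely small, which is the property that makes LASSO usable for variable selection.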

All in all, the four methods balance bias and variance, seeking to reduce variance while improving prediction accuracy, that is, to minimize prediction error. The random forest and LASSO methods also perform variable selection, so their predictions can be interpreted and can thus reveal the underlying mechanisms.

An important question arises: how should one select a relatively better algorithm in machine learning? The ‘No Free Lunch Theorem (NFL)’ proposed by Wolpert and Macready (1997) asserts that no algorithm is superior to all others across all possible problems. It is therefore better to compare several algorithms and select a relatively better one. Following this logic, this study mainly investigates the prediction accuracy of random forest and compares its predictions with those of the other three methods. We then focus on the important features selected by random forest and analyze them together with the features selected by LASSO.

4. Data

4.1 Survey design

This study focuses on a particular online influencer in agriculture, Ms. Li Ziqi, who has gained prominence and popularity at home and abroad for her beautiful videos about cooking Chinese food, traditional Chinese handicrafts and idyllic life in rural China. As of January 2024, Ms. Li had around 18.10 million subscribers on the international platform YouTube, and 25.81 million and 48.46 million followers on Weibo and TikTok, respectively, two major social media platforms in China.1

Zhong et al. (2023) argued that Ms. Li is an online influencer as well as an online opinion leader, who could promote the sales of agricultural products by influencing the feelings, thoughts and opinions of her followers. Indeed, Ms. Li has very significant business and brand value. In 2018, Ms. Li launched an official flagship store under her own name on Taobao’s T-mall. In just six days, her online store achieved sales of more than 10 million CNY. Her brand’s sales reached 1.6 billion CNY in 2020. One product, river snail rice noodles, remains in short supply even with production reaching 3 million bags a day. Ms. Li’s success provides a valuable case study for discussing the impact of online influencers on agribusiness and rural e-commerce development.

We exploit primary data from an online survey conducted through the largest commercial online survey company in China from January 16 to 19, 2020. Simple random sampling was applied. A total of 1051 responses were obtained and were validated by the time spent completing the survey and the logical consistency of the answers, resulting in 953 valid responses (90.68%). Of these 953 respondents, 721 had heard of Ms. Li; after removing responses with missing values, 720 (75.55% of valid responses) were retained and form the sample analyzed in this study.

4.2 Data description

Our sample distribution is generally in line with population migration trends in China, with about 51% of respondents living in the eastern region, 23% in the central region, and the rest in the less developed western and northeastern regions.

Following the literature on the factors influencing consumer behavior reviewed in Section 2.3, we selected a series of socio-economic predictors: consumption willingness, consumers’ attention to the online influencer in agriculture and the cost of that attention, consumers’ online experiences, and their demographic characteristics and preferences. Of the respondents, 80.4% indicated their willingness to buy agricultural products recommended by Ms. Li, 69.3% followed Ms. Li on at least one social platform, and 57.9% reported admiration for Ms. Li. In addition, 88.2% believed that the agricultural products recommended by Ms. Li were safe, which may be related to their high evaluation of the good feelings that Ms. Li brought to them (average score 7.0) and of the social value she generated (average score 7.1). Respondents had considerable online experience, including purchasing agricultural products online: 44.7% had purchased agricultural products recommended by other online influencers, and respondents spent three to five hours online daily.

The sample mainly comprised female respondents (63.3%) and younger users (average age 30.2 years); most were married and had, on average, nearly one child. Urban residents made up 70.1% of the sample, and respondents were predominantly well educated, with more than 90% holding a bachelor’s degree or above. More than half of the respondents earned a salary between 5000 and 12 000 CNY, which might partly result from their higher educational attainment. In addition, respondents showed an overall appetite for risk (average self-rating 5.6), a high degree of patience (average self-rating 7.0), and a relatively strong tendency toward self-reported altruism (Table 1).

Table 1. Definition and descriptive information for variables

5. Results and discussion

5.1 Full sample predictions

As a basis for analysis and prediction, the full sample was split into a training set (70% of the data, n=505) and a test set (30% of the data, n=215). The predictions of the random forest algorithm are our main focus. First, the number of decision trees that minimizes the misclassification error needs to be determined. As can be seen in Figure 1, the minimum out-of-bag (OOB) error was obtained when 174 trees were grown; a forest of this size was therefore used for prediction on the test sample.
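
To make the notion of out-of-bag observations concrete, the following minimal sketch assumes the standard random forest design in which each tree is grown on a bootstrap sample of the training set, so that the observations never drawn for a tree can serve as its validation data. The seed and the helper name `bootstrap_oob` are hypothetical; only the sizes 505 and 174 come from the text.

```python
import random

def bootstrap_oob(n, rng):
    """Draw n indices with replacement; return the indices never drawn (out-of-bag)."""
    drawn = {rng.randrange(n) for _ in range(n)}
    return [i for i in range(n) if i not in drawn]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_train = 505           # training-set size reported in the text
n_trees = 174           # number of trees reported in the text

# Average share of training observations left out of a single tree's sample.
total_oob = sum(len(bootstrap_oob(n_train, rng)) for _ in range(n_trees))
oob_share = total_oob / (n_trees * n_train)

# Theoretical share: probability that an observation is missed in n_train draws.
expected_share = (1 - 1 / n_train) ** n_train
```

Roughly a third of the training observations are out-of-bag for any given tree, which is why the OOB error can be computed without touching the test set.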

Figure 1. Optimal tree number for the random forest algorithm. The red line represents the out-of-bag (OOB) error for non-purchasers, the green line indicates the OOB error for purchasers, and the black line denotes the OOB error for the entire sample. The OOB error for the entire sample ceases to decrease with the addition of more trees after reaching 174 trees; the minimum OOB error is therefore obtained when 174 trees are grown.

Table 2 reports the prediction accuracy of the random forest algorithm. The minimum OOB error was 23.17%, indicating an out-of-bag prediction accuracy of 76.83% on the training set. When the best-trained model was applied to the test set, the prediction accuracy rose to 85.12%. In other words, the trained random forest algorithm predicted consumers’ purchases of agricultural products recommended by online influencers with a high accuracy of 85.12%. This has practical significance for online influencers and related enterprises seeking to improve their marketing strategies and to better target those who have the willingness and ability to purchase agricultural products. Efficient marketing strategies will contribute to boosting online sales of agricultural products.

Table 2. Predictions for purchasing agricultural products online

The three other machine learning algorithms also provided good predictions, but none matched the prediction accuracy of random forest. First, the gradient boosting classification algorithm reached its lowest misclassification error of 19.21% when growing 300 trees, indicating 80.79% prediction accuracy on the training set, but its accuracy on the test set decreased slightly to 80.47%, which was lower than expected. Second, the best SVM model, tuned by 10-fold CV, had a minimum misclassification error of 20.61%. When the trained model was run on the test set, the prediction accuracy rose from 79.39% on the training set to 82.33%, which still fell short of the 85.12% achieved by the random forest algorithm. Third, the LASSO method shrinks the coefficients of the features, selecting and keeping the most important predictors for regression. In our sample, 5 of the 19 predictors were selected, achieving a minimum misclassification error of 19.80%. However, these five predictors achieved only 80.47% accuracy on the test set, which, like the gradient boosting classification algorithm, was lower than that of the random forest and SVM.
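
The reported test accuracies can be related to counts of correctly classified respondents with a few lines of arithmetic. The accuracies and the test-set size below are taken from the text; the implied counts are back-calculated here as an illustration and are not reported in the source.

```python
TEST_N = 215  # test-set size reported in the text

# Test-set accuracies (%) reported in the text.
reported = {
    "random forest": 85.12,
    "gradient boosting": 80.47,
    "SVM": 82.33,
    "LASSO": 80.47,
}

def implied_correct(acc_pct, n):
    """Number of correct predictions implied by a rounded accuracy (an inference)."""
    return round(acc_pct / 100 * n)

counts = {name: implied_correct(acc, TEST_N) for name, acc in reported.items()}

# Recompute the percentages from the implied counts as a consistency check.
recomputed = {name: round(100 * c / TEST_N, 2) for name, c in counts.items()}

# Pick the algorithm with the highest reported test accuracy.
best = max(reported, key=reported.get)
```

Under these assumptions the gap between the random forest and the next-best method corresponds to only a handful of test respondents, which is worth keeping in mind when comparing the algorithms.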

In general, the random forest algorithm performed better than the other machine learning techniques, with the highest prediction accuracy. In addition, the random forest algorithm provides clear information on important predictors, which can help to improve marketing strategies and to cultivate rural e-commerce professionals in a more targeted way, thus providing valuable information for policy making. The next section jointly considers the important variables selected by the random forest algorithm and the LASSO technique to investigate the key predictors of consumers’ purchases of agricultural products recommended by online influencers.

5.2 Feature importance and economic explanations

An important step in the random forest algorithm is to reduce the within-group impurity after splitting. Specifically, we consider the reduction of Gini impurity after the data are grouped. Gini impurity is the probability that two observations drawn at random from the same node belong to different classes. At any node t, Gini = Ø(t) = ∑j p(j|t)(1 − p(j|t)) = 1 − ∑j p(j|t)², where p(j|t) is the proportion of class j observations in node t. If a feature significantly reduces Gini impurity, it makes an important contribution to correct prediction.
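
The Gini formula can be evaluated directly on a toy node. The sketch below assumes that the reduction from a split is measured as the parent impurity minus the size-weighted impurity of the two child nodes, which is the usual convention; the labels and function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity 1 - sum_j p(j|t)^2 of the class labels in one node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right):
    """Impurity reduction from splitting a parent node into two child nodes."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted

# Hypothetical node: 1 = purchaser, 0 = non-purchaser.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left, right = [1, 1, 1, 0], [1, 0, 0, 0]  # children after a candidate split

parent_impurity = gini(parent)
reduction = gini_decrease(parent, left, right)
```

A feature whose splits yield large reductions of this kind, accumulated over the trees of the forest, ranks high in variable importance.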

In our study, 7 of the 19 socio-economic predictors contributed significantly to reducing Gini impurity (see Figure 2), with consumers’ willingness to purchase agricultural products recommended by Ms. Li and whether they followed Ms. Li on social media being the two most important predictors. LASSO logit regression retained the five factors that best explained consumer purchasing behavior, which together explained 32.2% of the total variance, whereas all 19 features explained 40.4%. The five selected features are consumers’ willingness to purchase, whether they followed Ms. Li on social media, the frequency of online shopping for agricultural products, whether they had bought products recommended by other online influencers in this domain, and the hours spent each week searching for information about Ms. Li.

Figure 2. Variable importance.

Interestingly, both the random forest algorithm and LASSO regression selected willingness to purchase and social media following as the most critical predictors. The frequency of online shopping for agricultural products and the hours spent each week searching for information about Ms. Li were also selected by both techniques (see Table 3). Since the important variables selected by the two algorithms overlap substantially, it can reasonably be inferred that our findings on important predictors are fairly robust and that these features support the correct prediction of consumers’ online purchases of agricultural products.
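
The overlap between the two selections can be quantified with simple set operations. The short labels below are our own paraphrases of the predictors named in the prose (Table 3 remains the authoritative list), and the Jaccard index is used here only as an illustrative summary, not as a statistic reported in the study.

```python
# Predictors highlighted by the random forest, as described in the text.
rf_selected = {
    "purchase intention",
    "follows on social media",
    "age",
    "emotional value",
    "social value",
    "weekly hours seeking information",
    "online shopping frequency",
}

# Predictors retained by LASSO, as described in the text.
lasso_selected = {
    "purchase intention",
    "follows on social media",
    "online shopping frequency",
    "bought from other influencers",
    "weekly hours seeking information",
}

shared = rf_selected & lasso_selected          # chosen by both techniques
only_lasso = lasso_selected - rf_selected      # chosen by LASSO alone
jaccard = len(shared) / len(rf_selected | lasso_selected)
```

On this reading, four of the five LASSO predictors also appear among the seven random forest predictors.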

Table 3. Variable selection

Table 4 provides descriptive information on the important variables in the purchasing and non-purchasing groups. With the exception of age, all features differed significantly between the two groups and, not surprisingly, the purchasing group scored higher on them than the non-purchasing group. This again suggests that our important variables have strong explanatory power for consumers’ online purchases of agricultural products recommended by online influencers.

Table 4. Descriptive statistics of important variables

According to the prediction results of the random forest algorithm, the most important predictor of consumers’ purchases of agricultural products recommended by online influencers is their purchase intention or willingness. This is consistent with economic theory, especially the theory of planned behavior (TPB), as well as with empirical findings (Javadi et al., 2012; Lim et al., 2016). According to the TPB proposed by Ajzen (1991), consumers’ behavioral intention directly affects their consumption behavior, and behavioral intention is in turn influenced by behavioral attitude, subjective norms, perceived behavioral control and other factors. When asked why they were willing to purchase the agricultural products recommended by Ms. Li, respondents most often cited high quality (55.3%), good taste (53.8%), and guaranteed food safety (35.4%). In addition, 29% of respondents reported that purchasing agricultural products recommended by Ms. Li reflected their support for agricultural development and for increasing farmers’ income, while 28% agreed that their purchases were in support of Ms. Li. However, it is also worth noting that although 80.4% of respondents declared their intention to purchase, 23% of them did not translate this willingness into consumption behavior.

Whether respondents followed Ms. Li on social media was the second most important predictor selected by the random forest algorithm. A possible explanation is that social media platforms provide frequent information exchange and emotional communication. On these platforms, followers learn about Ms. Li herself through video content, comments and communication, and observe the production process of the agricultural products she recommends, which allows them to form a sound judgment of product quality and taste. Following online influencers on social media platforms can help reduce information asymmetry for consumers and lower their search and verification costs (Goldfarb and Tucker, 2019). This is in line with the findings of Zhong et al. (2023), who found that the consumption decisions of followers were more likely to be affected by online influencers, which might be closely related to the credibility of online influencers (Kwon and Song, 2015) and the reduction of transaction costs.

Another important predictor was consumers’ age. Many studies have found that consumers in different age groups make different purchasing decisions; in particular, the young and the elderly show different preferences and attitudes towards online shopping (Clemes et al., 2014; Gong et al., 2013). Young and old people also differ in their acceptance of online influencers as a recent phenomenon, so younger users can be expected to be more willing to purchase agricultural products recommended by online influencers and to do so more frequently. Consumers’ overall evaluation of online influencers was also a valuable predictor. This evaluation relates to the social impacts generated by online influencers, such as the promotion of rural entrepreneurship or rural tourism, and to the feelings that online influencers bring to respondents through their video content, such as happiness, relief or inspiration. This is logical, because consumers’ positive evaluation of online influencers may give rise to trust, admiration, affection and other emotions towards the influencer, which enhance their purchase intention and increase their purchases. The time consumers spend following a particular online influencer was another useful predictor: the more time and attention consumers devote to following Ms. Li, the more likely they are to receive the information they want and to be persuaded to purchase the products she recommends. Finally, consumers’ previous online shopping experiences also contributed to accurate predictions. It can reasonably be conjectured that the more often consumers have purchased agricultural products online, the more experience and ability they have to judge whether the agricultural products recommended by an online influencer are worth purchasing, and thus to make an actual purchase.

Overall, our predictive results suggest that consumers’ decisions to purchase products recommended by online influencers depended largely on their purchase intentions and on whether they followed the online influencer on social media. The age of consumers, their evaluation of the value created by the online influencer, the attention they paid to the online influencer, and their online shopping experiences were also important predictors of the decision to buy agricultural products online. These findings offer a new perspective on how online influencers promote the online sales of agricultural products. The mechanisms might involve paying attention to the online influencer’s own brand, bringing more emotional value to consumers through high-quality content production, and attaching importance to providing social value. At the same time, our conclusions can provide valuable references for promoting the development of rural e-commerce and cultivating related professionals.

5.3 Focus on the social media attention

How purchase intentions affect consumer behavior has been discussed extensively, whereas social media following is a key predictor of particular interest. We have observed that social media attention can be an important factor influencing consumers’ purchasing decisions. The sample was therefore split into a follower group (n=499) and a non-follower group (n=221) to investigate the behavioral decisions and possible mechanisms of different groups of consumers in purchasing agricultural products recommended by online influencers. We also noted that social media attention was different from, but closely related to, the respondents’ admiration for Ms. Li. Therefore, as a reference, the sample was also split by admiration into a fans group (those whose self-reported admiration for Ms. Li was higher than the sample average, n=417) and a non-fans group (n=303).
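
The fans versus non-fans split by the sample average can be sketched as follows. The six admiration ratings and the helper name `split_by_mean` are hypothetical, and treating a rating exactly equal to the mean as non-fan is our reading of "higher than the sample average".

```python
def split_by_mean(scores):
    """Return (indices strictly above the mean, all remaining indices)."""
    mean = sum(scores) / len(scores)
    above = [i for i, s in enumerate(scores) if s > mean]
    rest = [i for i, s in enumerate(scores) if s <= mean]
    return above, rest

# Hypothetical self-reported admiration ratings for six respondents.
admiration = [9, 4, 7, 8, 3, 5]
fans, non_fans = split_by_mean(admiration)

# Group sizes reported in the text; each pair should cover the full sample of 720.
reported_groups = {"follower": 499, "non-follower": 221, "fans": 417, "non-fans": 303}
```

The two splits are defined on different variables, so a respondent can be a follower without being a fan, and vice versa.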

Table 5 reports the prediction results of the random forest algorithm. For the follower group, the misclassification error in predicting purchases of agricultural products recommended by online influencers was 19.89%, and the prediction accuracy on the test set (again 30% of the data) was 83.80%, which is also relatively high. The prediction accuracy for the fans group was still good, at 81.67%, whereas the prediction accuracy for the non-fans and non-follower groups decreased considerably. One possible explanation is that the sample sizes of the non-fans and non-follower groups were relatively small after splitting, which reduced the predictive effectiveness of machine learning. In general, however, the random forest algorithm could still provide specific information on mechanisms by selecting important variables.

Table 5. Random forest predictions by following group

The top five predictors of consumers’ purchases of agricultural products recommended by online influencers in each group are shown in Table 6. Purchase intention, multidimensional evaluations of Ms. Li, and age were important common features; purchase frequency and risk preference had heterogeneous impacts across groups; and whether respondents followed Ms. Li on social media was a key predictor of behavior in the non-fans group. These results reflect the heterogeneity of online shopping for agricultural products across groups, whose behaviors were affected by different key features. The magnitude of the impact of these features on consumer behavior also differed. For the follower group, purchase intention was the most important predictor, while the feelings brought by Ms. Li were the most important mechanism for the non-follower group. The most critical predictor for the fans group was age, and the behavior of the non-fans group was mainly predicted by whether they followed Ms. Li on social media.

Table 6. Variable selection by following group

Table 7 provides descriptive information on these predictors, which gives a fuller picture of the features of the follower and fans groups. Table 7 shows that followers and non-followers differed more in purchase intention and purchase frequency, whereas fans and non-fans differed more in their evaluation of the emotional and social values brought by Ms. Li, as well as in risk preference. In other words, respondents in the follower group may be relatively rational, attending to experience and purchasing ability, while respondents in the fans group may be more emotional, more easily driven by their own feelings and opinions, and more prone to impulsive consumption when online influencers make recommendations. These findings have practical value for online influencers and related enterprises seeking to improve their marketing strategies and promote the online sales of agricultural products.

Table 7. Descriptive statistics of important variables by following group

6. Conclusions and policy implications

Persistent rural out-migration has precipitated significant rural decay, underscoring the imperative to transform China’s rural economic and social structures. A prosperous rural e-commerce system presents a promising avenue to stimulate online sales of agricultural products, enhance farmers’ income, foster rural entrepreneurship and revitalize rural development. Online influencers, owing to their unique online profiles and strong motivation to monetize online traffic, can influence consumers’ purchase intentions and behaviors, and thus have significant brand and commercial value. Despite their potential significance as key stakeholders in the rural e-commerce ecosystem, the extant literature has not sufficiently elucidated the specific roles and mechanisms through which online influencers operate. Addressing this research gap, our study employs four machine learning techniques (random forest, gradient boosting classification, support vector machine, and LASSO regression) to systematically predict consumer behavior in purchasing agricultural products recommended by online influencers and to identify the underlying mechanisms. Furthermore, our analysis provides insights into the role of online influencers in facilitating the development of the rural e-commerce ecosystem.

The random forest algorithm, incorporating a comprehensive set of socioeconomic predictors, achieves the best predictive performance, with an accuracy of 85.12% on the test set, compared with 80.47% to 82.33% for the other three techniques. Seven of the 19 predictors were selected by the random forest algorithm as key predictors based on their substantial contribution to reducing Gini impurity. LASSO regression identified five important variables, four of which coincide with the selection of the random forest algorithm.

Purchase intention and whether to follow online influencers on social media were identified by both techniques as the two features that contributed the most to improving prediction accuracy. High quality, good taste and guaranteed food safety were the main reasons behind purchase intention. Following online influencers on social media emerged as particularly noteworthy, as it appears to mitigate information asymmetry and enhance emotional interactions. Furthermore, both techniques identified the frequency of shopping for agricultural products and the time invested in seeking information about Ms. Li as important predictors. In addition, age and consumers’ evaluations of the emotional and social value generated by online influencers were key predictors of consumers’ purchases of agricultural products recommended by online influencers.

Given the novel nature of social media following as a predictor, we conducted stratified analyses by segmenting the sample into distinct groups: followers versus non-followers, and fans versus non-fans based on respondents’ level of admiration for the online influencer. The random forest algorithm achieved good prediction performance on the follower and fans subsets. Variable importance analysis revealed that purchase intention, multidimensional evaluation of Ms. Li and age emerged as important common features, while purchase frequency and risk preference exhibited heterogeneous effects across groups, and whether respondents followed the online influencer on social media was identified as the primary predictor for the non-fans group.

Our conclusions have practical implications for online influencers and associated enterprises seeking to optimize their marketing strategies. First, the strong association between purchase intention and product quality, taste, and food safety underscores the importance of building consumer trust, particularly given the credence goods nature of agricultural products. To enhance trust and purchase intention, online influencers should prioritize the creation of authentic, high-quality video content that comprehensively addresses product quality and safety attributes, ensuring information legitimacy, authenticity, relevance, and comprehensiveness.

Second, consumers’ social media following and attention input to online influencers are important factors in predicting purchasing behavior. Online influencers and their teams should implement targeted strategies to increase consumer attention and platform following. This could include incorporating explicit calls-to-action in video content, encouraging audience interaction through reposts and comments, and fostering ongoing engagement with followers. Such strategies not only improve predictive metrics but also expand the online influencer’s social reach and impact.

Finally, consumers, especially followers and fans, attach great importance to the emotional and social value that online influencers can generate. Online influencers and their teams can therefore bring more happiness and relief to their followers by creating more videos that reflect quiet rural life, warm relationships, anecdotes and similar content. They can also contribute to social development by participating in more public welfare undertakings, popularizing knowledge, and communicating with followers at home and abroad, so as to enhance the trust and ‘stickiness’ of followers and fans.

From a policy perspective, our findings offer valuable insights for fostering the development of rural e-commerce in China and cultivating related professionals in this domain. Policymakers should consider establishing comprehensive training programs that leverage successful case studies of prominent online influencers. These programs should focus on key competencies such as content generation, attention attraction, value creation, and knowledge of the relevant policies and regulations, and should particularly target key demographics such as returning college graduates, young farmers, and agricultural cooperative leaders. Further, measures such as tax relief should be taken to encourage representative online influencers to cooperate with local farmers in various ways, for example by holding agricultural product exhibitions to help small-scale farmers sell their products online. To better stimulate the rural e-commerce ecosystem, the government should also support the construction of rural electronic information infrastructure, improve the rural logistics system, and enhance the digital literacy of rural residents, so as to promote the sustainable development of rural e-commerce.

It is imperative to recognize that Ms. Li’s case constitutes an exceptional success story within the context of China. Related analysis must account for the inherent survivorship bias, as empirical evidence indicates a substantial failure rate among online influencers and their associated e-commerce initiatives. The case’s distinctive characteristics, particularly its first-mover advantages and platform-specific supports, have established path dependencies that may not be generalizable to other contexts. Researchers and policymakers must exercise caution to avoid overgeneralizing from Ms. Li’s unique success and should prioritize the examination of cases that reflect the median experience of rural e-commerce participants, while still drawing insights from this success story.

The contributions of this study are twofold. First, we use several machine learning techniques to predict consumers’ online purchasing of agricultural products and to identify the important mechanisms behind it. Second, we extend research on the impact of online influencers on the business sector by examining their role in promoting online agricultural sales. The study nevertheless has limitations: the total sample size is limited, and the prediction accuracy for the subsets can be improved. Future research could increase the number of typical cases, quantitatively analyze important mechanisms such as purchase intention, social media attention, and value creation, and examine how different levels of social media attention, or following an influencer on different social media platforms, affect consumers’ purchases of agricultural products recommended by online influencers.

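In addition, the results are consistent with the No Free Lunch theorem (Wolpert and Macready, 1997): no single algorithm performs best on every problem, so in practical machine learning applications several algorithms should be compared to identify the one that performs relatively better for the task at hand.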
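As a minimal sketch of what such a comparison can look like, the following Python example scores three deliberately simple classifiers with k-fold cross-validation and reports the one that does best on the data at hand. Everything in it is an illustrative assumption: the data are synthetic, the feature names (`intention`, `age`) are hypothetical stand-ins, and the classifiers are toy models rather than the four algorithms or the survey data used in this study.

```python
import random

def make_data(n=200, seed=0):
    """Synthetic stand-in for survey data (NOT the study's data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        intention = rng.gauss(0, 1)   # hypothetical purchase-intention score
        age = rng.gauss(0, 1)         # hypothetical standardized age
        noise = rng.gauss(0, 0.5)
        label = 1 if intention + 0.3 * age + noise > 0 else 0
        rows.append(((intention, age), label))
    return rows

def accuracy(predict, rows):
    """Share of rows whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def fit_majority(train):
    """Baseline: always predict the most frequent training label."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label

def fit_stump(train):
    """Best single-feature threshold rule (a depth-1 decision tree)."""
    best_acc, best_rule = -1.0, None
    for j in range(len(train[0][0])):
        for row, _ in train:
            for sign in (1, -1):
                rule = lambda x, j=j, t=row[j], s=sign: 1 if s * (x[j] - t) > 0 else 0
                acc = accuracy(rule, train)
                if acc > best_acc:
                    best_acc, best_rule = acc, rule
    return best_rule

def fit_centroid(train):
    """Nearest-centroid classifier on the raw features."""
    centroids = {}
    for label in (0, 1):
        points = [x for x, y in train if y == label]
        centroids[label] = [sum(col) / len(points) for col in zip(*points)]
    def predict(x):
        dist = {lab: sum((a - b) ** 2 for a, b in zip(x, c))
                for lab, c in centroids.items()}
        return min(dist, key=dist.get)
    return predict

def cv_accuracy(fit, rows, k=5, seed=0):
    """Mean out-of-fold accuracy over k folds."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit(train), test))
    return sum(scores) / k

data = make_data()
models = {"majority": fit_majority, "stump": fit_stump, "centroid": fit_centroid}
results = {name: cv_accuracy(fit, data) for name, fit in models.items()}
best = max(results, key=results.get)

for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {acc:.3f}")
print("best on this data:", best)
```

The ranking produced here holds only for this synthetic data set. With a different data-generating process another model could come out ahead, which is the practical reason for comparing candidates on held-out data rather than fixing an algorithm in advance.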

References

  • Abidin, C. 2015. Communicative intimacies: influencers and perceived interconnectedness. Ada 8: 1–16.

  • Abidin, C. 2018. What is an Internet celebrity anyway? In Internet Celebrity: Understanding Fame Online (Society Now). Emerald, Bingley, pp. 1–18.

  • Ahmad, S.Z., A.R. Abu Bakar, T.M. Faziharudean and K.A.M. Zaki. 2015. An empirical study of factors affecting e-commerce adoption among small- and medium-sized enterprises in a developing country: Evidence from Malaysia. Information Technology for Development 21: 555–572.

  • Ajzen, I. 1991. The theory of planned behavior. Organizational Behavior and Human Decision Processes 50: 179–211.

  • Atasoy, H. 2013. The effects of broadband internet expansion on labor market outcomes. ILR Review 66: 315–345.

  • Beckers, J., I. Cárdenas and A. Verhetsel. 2018. Identifying the geography of online shopping adoption in Belgium. Journal of Retailing and Consumer Services 45: 33–41.

  • Browne, C., D.S. Matteson, L. McBride, L. Hu, Y. Liu, Y. Sun, J. Wen and C.B. Barrett. 2021. Multivariate random forest prediction of poverty and malnutrition prevalence. PLoS ONE 16: e0255519.

  • Bucko, J., L. Kakalejčík and M. Ferencová. 2018. Online shopping: factors that affect consumer purchasing behaviour. Cogent Business and Management 5: 1535751.

  • Casaló, L., C. Flavián and S. Ibáñez-Sánchez. 2020. Influencers on Instagram: Antecedents and consequences of opinion leadership. Journal of Business Research 117: 510–519.

  • Chang, E. and T. Woo. 2019. The influence of Internet celebrities (Wanghongs) on social media users in China. Proceedings CERC, pp. 373–379.

  • Couture, V., B. Faber, Y. Gu and L. Liu. 2021. Connecting the countryside via e-commerce: evidence from China. American Economic Review: Insights 3: 35–50.

  • Craig, D., J. Lin and S. Cunningham. 2021. Wanghong as Social Media Entertainment in China. Palgrave Macmillan, London.

  • Cui, M., S.L. Pan, S. Newell and L. Cui. 2017. Strategy, resource orchestration and e-commerce enabled social innovation in Rural China. The Journal of Strategic Information Systems 26: 3–21.

  • Cui, M., S.L. Pan and L. Cui. 2019. Developing community capability for e-commerce development in rural China: A resource orchestration perspective. Information Systems Journal 29: 953–988.

  • De Veirman, M., V. Cauberghe and L. Hudders. 2017. Marketing through Instagram influencers: the impact of number of followers and product divergence on brand attitude. International Journal of Advertising 36: 798–828.

  • Djafarova, E. and C. Rushworth. 2017. Exploring the credibility of online celebrities’ Instagram profiles in influencing the purchase decisions of young female users. Computers in Human Behavior 68: 1–7.

  • Djafarova, E. and O. Trofimenko. 2019. ‘Instafamous’: credibility and self-presentation of micro-celebrities on social media. Information, Communication and Society 22(10): 1432–1446.

  • Freberg, K., K. Graham, K. McGaughey and L.A. Freberg. 2011. Who are the social media influencers? A study of public perceptions of personality. Public Relations Review 37(1): 90–92.

  • Freund, Y. and R.E. Schapire. 1996. Experiments with a new boosting algorithm. Machine Learning: Proceedings of the Thirteenth International Conference, Morgan Kaufmann, San Francisco, CA, pp. 148–156.

  • Freund, Y. and R.E. Schapire. 1997. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55: 119–139.

  • Friedman, J.H. 2001. Greedy function approximation: a gradient boosting machine. Annals of Statistics 29(5): 1189–1232.

  • Goldfarb, A. and C. Tucker. 2019. Digital economics. Journal of Economic Literature 57(1): 3–43.

  • Gong, W., R. Stump and L. Maddox. 2013. Factors influencing consumers’ online shopping in China. Journal of Asia Business Studies 7: 214–230.

  • Höschle, L., S. Trestini and E. Giampietri. 2023. Participation in a mutual fund covering losses due to pest infestation: analyzing key predictors of farmers’ interest through machine learning. International Food and Agribusiness Management Review 26(3): 535–554.

  • Htet, N.L., W. Kongprawechnon, S. Thajchayapong and T. Isshiki. 2021. Machine learning approach with multiple open-source data for mapping and prediction of poverty in Myanmar. In 18th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), pp. 1041–1045.

  • Javadi, M.H.M., H.R. Dolatabadi, M. Nourbakhsh, A. Poursaeedi and A. Asadollahi. 2012. An analysis of factors affecting on online shopping behavior of consumers. International Journal of Marketing Studies 4: 81–98.

  • Jin, S.V., A. Muqaddam and E. Ryu. 2019. Instafamous and social media influencer marketing. Marketing Intelligence and Planning 37: 567–579.

  • Karine, H. 2021. E-commerce development in rural and remote areas of BRICS countries. Journal of Integrative Agriculture 20: 979–997.

  • Katz, E. and D. Foulkes. 1962. On the use of the mass media as “escape”: Clarification of a concept. Public Opinion Quarterly 26: 377–388.

  • Khamis, S., L. Ang and R. Welling. 2017. Self-branding, ‘micro-celebrity’ and the rise of social media influencers. Celebrity Studies 8: 191–208.

  • Klein, J. 2013. Reputation economics: Why who you know is worth more than what you have. Macmillan, London.

  • Kshetri, N. 2018. Rural e-commerce in developing countries. IT Professional 20: 91–95.

  • Kwon, Y. and H.R. Song. 2015. The role of opinion leaders in influencing consumer behaviors with a focus on market mavens: a meta-analysis. Athens Journal of Mass Media and Communications 3: 43–54.

  • Lele, U. and S. Goswami. 2017. The fourth industrial revolution, agricultural and rural innovation and implications for public policy and investments: a case of India. Agricultural Economics 48: 87–100.

  • Li, L., K. Du, W. Zhang and J.-Y. Mao. 2019. Poverty alleviation through government-led e-commerce development in rural China: an activity theory perspective. Information Systems Journal 29: 914–952.

  • Li, L., Y. Zeng, Z. Ye and H. Gao. 2021. E-commerce development and urban–rural income gap: Evidence from Zhejiang Province, China. Papers in Regional Science 100: 475–494.

  • Li, R. 2018. The secret of internet celebrities: A qualitative study of online opinion leaders on Weibo. Proceedings of the 51st Hawaii International Conference on System Sciences.

  • Li, X., H. Guo, S. Jin, W. Ma and Y. Zhang. 2021. Do farmers gain internet dividends from E-commerce adoption? Evidence from China. Food Policy 101: 102024.

  • Li, Y. and X. Yu. 2025. Attribute non-attendance in the choice experiment with machine learning: WTP for organic apples in Germany. International Food and Agribusiness Management Review, in press. https://doi.org/10.22434/IFAMR.1133

  • Lim, Y.J., A. Osman, S.N. Salahuddin, A.R. Romie and S. Abdullah. 2016. Factors influencing online shopping behavior: the mediating role of purchase intention. Procedia Economics and Finance 35: 401–410.

  • Liu, M., Q. Zhang, S. Gao and J. Huang. 2020. The spatial aggregation of rural e-commerce in China: An empirical investigation into Taobao Villages. Journal of Rural Studies 80: 403–417.

  • Liu, M., S. Min, W. Ma and T. Liu. 2021. The adoption and impact of E-commerce in rural China: Application of an endogenous switching regression model. Journal of Rural Studies 83: 106–116.

  • Liu, Y. and Y. Li. 2017. Revitalize the world’s countryside. Nature 548: 275–277.

  • Malecki, E.J. 2003. Digital development in rural areas: Potentials and pitfalls. Journal of Rural Studies 19: 201–214.

  • Maruejols, L., H. Wang, Q. Zhao, Y. Bai and L. Zhang. 2023. Comparison of machine learning predictions of subjective poverty in rural China. China Agricultural Economic Review 15: 379–399.

  • Maruejols, L., L. Höschle and X. Yu. 2025. Energy independence, rural sustainability and potential of bioenergy villages in Germany: machine learning perspectives. International Food and Agribusiness Management Review, in press. https://doi.org/10.22434/ifamr1132.

  • Park, H. and L. Lin. 2020. The effects of match-ups on the consumer attitudes toward internet celebrities and their live streaming contents in the context of product endorsement. Journal of Retailing and Consumer Services 52: 101934.

  • Peng, C., B. Ma and C. Zhang. 2021. Poverty alleviation through e-commerce: Village involvement and demonstration policies in rural China. Journal of Integrative Agriculture 20: 998–1011.

  • Qi, J., X. Zheng and H. Guo. 2019. The formation of Taobao villages in China. China Economic Review 53: 106–127.

  • Sandel, T. and Y. Wang. 2022. Selling intimacy online: the multi-modal discursive techniques of China’s wanghong. Discourse, Context and Media 47: 100606.

  • Sin, L. and A. Tse. 2002. Profiling internet shoppers in Hong Kong: demographic, psychographic, attitudinal and experiential factors. Journal of International Consumer Marketing 15: 7–29.

  • Srivastava, A. and P. Thaichon. 2023. What motivates consumers to be in line with online shopping?: a systematic literature review and discussion of future research perspectives. Asia Pacific Journal of Marketing and Logistics 35: 687–725.

  • Tang, W. and J. Zhu. 2020. Informality and rural industry: Rethinking the impacts of E-commerce on rural development in China. Journal of Rural Studies 75: 20–29.

  • Tibshirani, R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology 58: 267–288.

  • Umeogu, B. 2012. Source credibility: a philosophical analysis. Open Journal of Philosophy 2: 112–115.

  • Wang, H., L. Maruejols and X. Yu. 2021. Predicting energy poverty with combinations of remote-sensing and socioeconomic survey data in India: evidence from machine learning. Energy Economics 102: 105510.

  • Wang, Y. and D. Feng. 2022. Identity performance and self-branding in social commerce: A multimodal content analysis of Chinese wanghong women’s video-sharing practice on TikTok. Discourse, Context and Media 50: 100652.

  • Wolpert, D.H. and W.G. Macready. 1997. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1(1): 67–82. https://doi.org/10.1109/4235.585893.

  • World Bank Group. 2016. World Development Report 2016: Digital Dividends. World Bank, Washington, DC.

  • Yu, H. and L. Cui. 2019. China’s e-commerce: empowering rural women? The China Quarterly 238: 418–437.

  • Zapata, S.D., C.E. Carpio, O. Isengildina-Massa and R.D. Lamie. 2013. The economic impact of services provided by an electronic trade platform: The case of MarketMaker. Journal of Agricultural and Resource Economics 38: 359–378.

  • Zhang, G. and G. de Seta. 2018. Being “red” on the internet. In Microcelebrity around the globe. Emerald, Bingley, pp. 57–67.

  • Zhang, Y., H. Long, L. Ma, S. Tu, Y. Li and D. Ge. 2022. Analysis of rural economic restructuring driven by e-commerce based on the space of flows: the case of Xiaying village in central China. Journal of Rural Studies 93: 196–209.

  • Zhong, X., J. Wang and X. Yu. 2023. Internet celebrities, public opinions and food system change in China: a new conceptual framework. International Food and Agribusiness Management Review 26: 467–487.

  • Zhou, J., L. Yu and C.L. Choguill. 2021. Co-evolution of technology and rural society: the blossoming of taobao villages in the information era, China. Journal of Rural Studies 83: 81–87.

  • Zhou, L., L. Dai and D. Zhang. 2007. Online shopping acceptance model: A critical survey of consumer factors in online shopping. Journal of Electronic Commerce Research 8: 41–62.

  • Zhu, B., Y. Song and G. Li. 2016. Spatial aggregation pattern and influencing factors of “Taobao village” in China under the C2C e-commerce mode. Economic Geography 36(4): 92–98.

Corresponding author

1 All the data on followers/subscribers were accessed online on 10 January 2024.
