Spurious Correlation and the Closure Property of Compositional Data in Earth Sciences

1 - University of Kashan

Received: 1403/06/06; Accepted: 1403/06/06; Published: 1403/06/06

Keywords: compositional data and their closure property, log-ratio transformations, robust statistical methods, spurious correlation

Abstract:

Compositional data, which commonly result from measurements in the earth sciences, have an important property known as closure. Researchers who disregard this property and apply standard statistical methods, using a logarithmic transformation to reduce skewness or to normalize the data, in effect ignore the spurious correlation present in compositional data, and this leads to incorrect statistical results. This article first introduces compositional data and their closure property, and then presents transformations for opening the closed system of the data. These are the additive log-ratio (alr), centered log-ratio (clr), and isometric log-ratio (ilr) transformations, all of which are defined in terms of logarithms of ratios. After these transformations are introduced and their advantages and disadvantages relative to one another are listed, one of them, the clr transformation, is applied to a data set from the chemical analysis of soil. In addition, the results of applying cluster analysis to the transformed data, using the Spearman correlation coefficient matrix as the distance matrix, are examined. The effect of the clr transformation on removing spurious correlation and on moderating skewness and outliers in the data is also examined with the help of several statistical plots, using the R statistical software.
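
For orientation, the three log-ratio transformations named in the abstract are usually written as follows. The abstract itself gives no formulas, so the notation here is an assumption: $x = (x_1, \dots, x_D)$ is a composition with strictly positive parts, $g(x)$ is the geometric mean of its parts, and $V$ is any $D \times (D-1)$ matrix whose columns are orthonormal and each sum to zero.

```latex
\begin{align*}
\operatorname{alr}(x) &= \left( \ln\frac{x_1}{x_D},\ \dots,\ \ln\frac{x_{D-1}}{x_D} \right), \\
\operatorname{clr}(x) &= \left( \ln\frac{x_1}{g(x)},\ \dots,\ \ln\frac{x_D}{g(x)} \right),
\qquad g(x) = \Bigl( \prod_{i=1}^{D} x_i \Bigr)^{1/D}, \\
\operatorname{ilr}(x) &= V^{\top} \operatorname{clr}(x).
\end{align*}
```

In this form the alr depends on which part is chosen as the denominator, the clr coordinates always sum to zero, and the ilr depends on the choice of $V$.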

English abstract:

In the earth sciences, measurements typically yield compositional data, which have a property known as closure. Applying common statistical methods to compositional data, for example after a logarithmic transformation intended to reduce skewness or normalize the data, ignores the spurious correlations induced by closure and therefore yields incorrect statistical results. This article presents a set of transformations for opening closed systems of compositional data: the additive log-ratio (alr), the centered log-ratio (clr), and the isometric log-ratio (ilr) transformations, all of which are defined in terms of logarithms of ratios. After their relative advantages and disadvantages are discussed, the clr transformation is applied to a soil chemistry data set. Cluster analysis is then applied to the clr-transformed data, using the Spearman correlation coefficient matrix as the distance matrix. Finally, the effect of the clr transformation on spurious correlation, skewness, and outliers in the data is evaluated with statistical plots produced in the R statistical software.
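
The closure effect described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the article, and everything in it is an assumption made for illustration: the simulated data (three independent lognormal variables), the helper names `closure`, `clr`, and `pearson`, and the sample size. It does not reproduce the article's soil data or its analysis.

```python
import math
import random


def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total` (constant-sum constraint)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(composition):
    """Centered log-ratio: log of each part relative to the geometric mean of all parts."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: three mutually independent positive variables.
rng = random.Random(0)
raw = [[rng.lognormvariate(0.0, 0.5) for _ in range(3)] for _ in range(500)]

# Closing each row to a constant sum of 1 turns it into a composition.
closed = [closure(row) for row in raw]

# Correlation between the first two variables, before and after closure.
r_raw = pearson([row[0] for row in raw], [row[1] for row in raw])
r_closed = pearson([row[0] for row in closed], [row[1] for row in closed])
print(f"raw r = {r_raw:.2f}, closed r = {r_closed:.2f}")

# clr coordinates sum to zero and do not depend on the chosen total.
example = [2.0, 3.0, 5.0]
print(clr(example))
print(clr(closure(example)))
```

Although the raw variables are generated independently, the closed parts are negatively correlated purely because they are forced to sum to a constant. The last two lines show that the clr coordinates are unaffected by rescaling the composition.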

