
Alanine Aminotransferase (ALT) Test | HealthLink BC

Topic Contents

  • Test Overview
  • Why It Is Done
  • How To Prepare
  • How It Is Done
  • How It Feels
  • Risks
  • Results

Test Overview

An alanine aminotransferase (ALT) test measures the amount of this enzyme in the blood. ALT is found mainly in the liver, but also in smaller amounts in the kidneys, heart, muscles, and pancreas. ALT was formerly called serum glutamic pyruvic transaminase (SGPT).

ALT is measured to see if the liver is damaged or diseased. Low levels of ALT are normally found in the blood. But when the liver is damaged or diseased, it releases ALT into the bloodstream, which makes ALT levels go up. Most increases in ALT levels are caused by liver damage.

The ALT test is often done along with other tests that check for liver damage, including aspartate aminotransferase (AST), alkaline phosphatase, lactate dehydrogenase (LDH), and bilirubin. Both ALT and AST levels are reliable tests for liver damage.

Why It Is Done

The ALT test is done to:

  • Identify liver disease, such as cirrhosis and hepatitis, caused by alcohol, drugs, or viruses.
  • Help check for liver damage.
  • Find out whether jaundice was caused by a blood disorder or liver disease.
  • Keep track of the effects of cholesterol-lowering medicines and other medicines that can damage the liver.

How To Prepare

Avoid strenuous exercise just before having an ALT test.

How It Is Done

A health professional uses a needle to take a blood sample, usually from the arm.

How long the test takes

The test will take a few minutes.

How It Feels

When a blood sample is taken, you may feel nothing at all from the needle. Or you might feel a quick sting or pinch.

Risks

There is very little chance of having a problem from this test. When a blood sample is taken, a small bruise may form at the site.

Results

Each lab has a different range for what’s normal. Your lab report should show the range that your lab uses for each test. The normal range is just a guide. Your doctor will also look at your results based on your age, health, and other factors. A value that isn’t in the normal range may still be normal for you.

Results are usually available within 12 hours.

High values

High levels of ALT may be caused by:

  • Liver damage from conditions such as hepatitis or cirrhosis.
  • Lead poisoning.
  • Very strenuous exercise or severe injury to a muscle.
  • Exposure to carbon tetrachloride.
  • Decay of a large tumour (necrosis).
  • Many medicines, such as statins, antibiotics, chemotherapy, aspirin, opioids, and barbiturates.
  • Mononucleosis.
  • Growth spurts, especially in young children. Rapid growth can cause mildly elevated levels of ALT.


Alanine aminotransferase (ALT) Test

Also Known As

Serum glutamic-pyruvic transaminase (SGPT)

At a Glance

Why Get Tested?

To screen for liver disease

When To Get Tested?

If your doctor thinks that you have symptoms of a liver disorder

Sample Required?

A blood sample will be taken from a vein in the arm

Test Preparation Needed?

No test preparation is needed, although you should inform your doctor about any drugs you are taking

On average it takes 7 working days for blood test results to come back from the hospital, depending on the exact tests requested. Some specialist test results may take longer, if samples have to be sent to a reference (specialist) laboratory. X-ray and scan results may also take longer. If you are registered to use the online services of your local practice, you may be able to access your results online. Your GP practice will be able to provide specific details.

If the doctor wants to see you about the result(s), you will be offered an appointment. If you are concerned about your test results, you will need to arrange an appointment with your doctor so that all relevant information including age, ethnicity, health history, signs and symptoms, laboratory and other procedures (radiology, endoscopy, etc.), can be considered.

Lab Tests Online-UK is an educational website designed to provide patients and carers with information on laboratory tests used in medical care. We are not a laboratory and are unable to comment on an individual’s health and treatment.

Reference ranges are dependent on many factors, including patient age, sex, sample population, and test method, and numeric test results can have different meanings in different laboratories.

For these reasons, you will not find reference ranges for the majority of tests described on this web site. The lab report containing your test results should include the relevant reference range for your test(s). Please consult your doctor or the laboratory that performed the test(s) to obtain the reference range if you do not have the lab report.

For more information on reference ranges, please read Reference Ranges and What They Mean.

What is being tested?

ALT is an enzyme found mostly in the liver; smaller amounts are also found in the kidneys, heart and muscles. When the liver is damaged, ALT is released into the bloodstream, hence increasing the concentration that can be detected in a blood test. This often happens before more obvious symptoms of liver damage occur, such as jaundice (yellowing of the eyes and skin).


Common Questions

  • How is it used?

    The ALT blood test detects liver injury. ALT results are usually assessed alongside the results of other blood tests such as alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT) and aspartate aminotransferase (AST) to help determine which form of liver disease is present.

  • When is it requested?

    A doctor usually requests an ALT test with other laboratory investigations to evaluate a patient who has symptoms of a liver disorder. ALT is used to identify liver damage. Some of these symptoms include jaundice, dark urine, nausea, vomiting, abdominal swelling, unusual weight gain and abdominal pain. ALT can also be used, either by itself or with other tests, for patients at risk of developing liver disease such as:

    • persons who have a history of known or possible exposure to hepatitis viruses
    • those who drink too much alcohol
    • those whose family have a history of liver disease
    • people who take drugs that might damage the liver
    • those who are overweight or who have diabetes

    In people with mild symptoms, such as tiredness or loss of energy, ALT may be tested to make sure they do not have chronic (long-term) liver disease. ALT is often used to monitor the treatment of persons who have liver disease to see if the treatment is working and may be requested either by itself or along with other blood tests.

  • What does the test result mean?

    Very high concentrations of ALT (more than 10 times the highest normal level) are usually due to acute (short-term) hepatitis, often due to a viral infection. In acute hepatitis, the concentration of ALT usually stays high for about 1–2 months but can take as long as 3–6 months to return to normal. 

    ALT concentrations are usually not as high in chronic hepatitis, often less than 4 times the highest normal level. In this case, ALT concentrations often vary between normal and slightly increased, so doctors may request the test frequently to see if there is a pattern. A moderately high ALT can also occur when there is a high alcohol intake, diabetes or raised serum triglycerides, all of which can cause fatty liver.

    In some liver diseases, especially when the bile ducts are blocked (cholestasis), when a person has cirrhosis or when liver cancer is present, the concentration of ALT may be close to normal.

  • Is there anything else I should know?

    An injection of medicine into the muscle tissue or strenuous exercise may increase ALT concentration as it is released into the bloodstream from muscle.

    Certain drugs may cause liver damage, resulting in high ALT concentrations. This occurs in a very small percentage of patients and is true of both prescription drugs and some ‘natural’ health products. If your doctor finds that you have a high ALT, tell them about all the drugs and health products you are taking.

  • What is hepatitis?

    Hepatitis is an inflammation of the liver. There are two major forms: acute and chronic. Acute hepatitis is a fast-developing disease and typically makes affected persons feel sick, as if they have the flu, often with loss of appetite and sometimes diarrhoea and vomiting. In many cases, acute hepatitis causes dark urine, pale stools and yellowing of the skin and eyes (jaundice). Most affected individuals eventually recover completely. Chronic (long-term) hepatitis usually causes no symptoms or causes only loss of energy and tiredness; most people don’t know that they have it. In some people, chronic hepatitis can gradually damage the liver and, after many years, cause it to fail.

  • What are the other liver tests?

    Other commonly used liver tests include other enzymes found in liver cells, such as aspartate aminotransferase (AST) and alkaline phosphatase as well as bilirubin, which is a breakdown product from red blood cells removed from the body by the liver and spleen. The doctor will often order these tests together as a group and refer to them as ‘liver function tests’. Albumin is also frequently included in the liver function test profile because, as albumin is produced by the liver, it can be used as a measure of liver protein synthesis. However, other factors can affect the concentration of albumin in the blood such as poor nutrition or excessive loss from the gut or kidney.


Explaining p-Values for Beginner Data Scientists

The first time I came across the term was when I read that the discovery of the Higgs boson had been announced at “five sigma” (meaning having a p-value of 0.0000003).

Back then, I didn’t know anything about p-value, hypothesis testing, or even statistical significance.

I decided to google the word “p-value” and what I found on Wikipedia made me even more confused…

In statistical hypothesis testing, the p-value or probability value is, for a given statistical model, the probability that, when the null hypothesis is true, the statistical summary (for example, the absolute value of the sample mean difference between two compared groups) would be greater than or equal to the actual observed results.
— Wikipedia

Good job, Wikipedia.

Okay. I didn’t understand what p-value actually means.

As I delved into the realm of data science, I finally began to understand the meaning of the p-value and where it can be used as part of the decision-making tools in certain experiments.

So I decided to explain the p-value in this article and how it can be used in hypothesis testing to give you a better and more intuitive understanding of p-values.

Although we can’t skip the fundamental concepts that underpin the definition of the p-value, I promise to keep this explanation intuitive without burying you in all the technical terms I’ve come across.

There are four sections in this article to give you a complete picture from building a hypothesis test to understanding the p-value and using it in decision making. I highly recommend that you go through all of them to get a detailed understanding of p-values:

  1. Hypothesis Test
  2. Normal distribution
  3. What is a P-value?
  4. Statistical significance

It will be fun.

Let’s get started!

1. Hypothesis Testing

Before we talk about what p-value means, let’s start by looking at hypothesis testing, where p-value is used to determine the statistical significance of our results.

Our final goal is to determine the statistical significance of our results.

And statistical significance is built on these 3 simple ideas:

  • Hypothesis testing
  • Normal distribution
  • P-value

Hypothesis testing is used to test the validity of a statement (null hypothesis) made about a population using sample data. The alternative hypothesis is the one you would believe if the null hypothesis were wrong.

In other words, we will create a statement (null hypothesis) and use the sample data to check if the statement is valid. If the statement is not true, we will choose an alternative hypothesis. Everything is very simple.

To find out whether a claim is valid, we will use the p-value to weigh the strength of the evidence and see if it is statistically significant. If the evidence supports the alternative hypothesis, then we will reject the null hypothesis and accept the alternative hypothesis. This will be explained in the next section.

Let’s use an example to make this concept clearer, and this example will be used throughout this article for other concepts.

Example. Suppose a pizzeria claims to have an average delivery time of 30 minutes or less, but you think it’s longer than advertised. So you run a hypothesis test, randomly sampling some delivery times to test the claim:

  • Null hypothesis – the average delivery time is 30 minutes or less
  • Alternative hypothesis – the average delivery time exceeds 30 minutes

The goal here is to determine which statement, null or alternative, is better supported by the evidence from our sample data.

We will use a one-sided test in our case, since we only care whether the average delivery time exceeds 30 minutes. We don’t consider the other direction, since an average delivery time of 30 minutes or less is exactly what we’d prefer. Here we want to check whether the average delivery time is greater than 30 minutes – in other words, whether the pizzeria has cheated us.

One common way to test hypotheses is to use the Z-test. We won’t go into details here, as we want to get a better understanding of what’s going on on the surface before diving deeper.
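To make this concrete, here is a minimal sketch of a one-sided, one-sample Z-test in Python. The delivery times are invented for illustration (so the resulting numbers won’t match the ones used later in this article):

```python
import numpy as np
from scipy import stats

# Hypothetical delivery times in minutes (invented for illustration).
delivery_times = np.array([32, 41, 28, 45, 38, 33, 40, 36, 31, 44])

mu_0 = 30.0                     # claimed average delivery time (null hypothesis)
x_bar = delivery_times.mean()   # sample mean
s = delivery_times.std(ddof=1)  # sample standard deviation
n = len(delivery_times)

# Test statistic: how many standard errors the sample mean lies above the claim.
z = (x_bar - mu_0) / (s / np.sqrt(n))

# One-sided p-value: the probability of a sample mean at least this large
# if the null hypothesis were true.
p_value = stats.norm.sf(z)      # survival function, i.e. 1 - cdf
print(f"z = {z:.2f}, p = {p_value:.4f}")
```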

2. Normal distribution

The normal distribution is a probability density function that describes how data are distributed.

The normal distribution has two parameters, the mean (μ) and the standard deviation, also called sigma (σ).

The mean is the central tendency of the distribution. It specifies the location of the peak for normal distributions. Standard deviation is a measure of variability. It determines how far from the mean the values tend to fall.

The normal distribution is usually associated with the 68-95-99.7 rule (checked numerically in the sketch right after this list):

  • 68% of data are within 1 standard deviation (σ) of the mean (μ)
  • 95% of data are within 2 standard deviations (σ) of the mean (μ)
  • 99.7% of data are within 3 standard deviations (σ) of the mean (μ)
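These percentages can be checked directly from the normal CDF; a quick sketch using scipy’s standard normal functions:

```python
from scipy import stats

# P(|Z| <= k) for a standard normal: cdf(k) - cdf(-k).
for k in (1, 2, 3):
    p = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"within {k} standard deviation(s): {p:.5f}")
# within 1 standard deviation(s): 0.68269
# within 2 standard deviation(s): 0.95450
# within 3 standard deviation(s): 0.99730
```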

Remember the five-sigma threshold for discovering the Higgs boson that I talked about at the beginning? Five sigma corresponds to about 99.9999426696856% of the data falling within five standard deviations of the mean; only a result far enough out in the tail to clear that bar would count. This strict threshold was set to avoid any possible false signals.

Cool. Now you might be wondering, “How does the normal distribution relate to our previous hypothesis testing?”

Since we used the Z-test to test our hypothesis, we need to calculate Z-scores (to be used in our test statistic), which are the number of standard deviations a data point lies from the mean. In our case, each data point is a pizza delivery time we observed.

Note that when we calculate a Z-score for each pizza delivery time and plot the standard bell curve, the unit on the x-axis changes from minutes to standard deviations, because we standardized the variable by subtracting the mean and dividing by the standard deviation: z = (x − μ) / σ.

Examining the standard bell curve is useful because it lets us compare results against a “normal” population in a standardized unit (standard deviations), especially when variables come in different units.
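As a small illustration, standardizing the invented sample from the earlier sketch might look like this:

```python
import numpy as np

# The same hypothetical delivery times as in the earlier sketch.
delivery_times = np.array([32, 41, 28, 45, 38, 33, 40, 36, 31, 44])

mu = delivery_times.mean()
sigma = delivery_times.std(ddof=1)

# Standardize: z = (x - mu) / sigma, so each delivery time is expressed
# in standard deviations from the mean instead of minutes.
z_scores = (delivery_times - mu) / sigma
print(np.round(z_scores, 2))
```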

What the Z-score can tell us

I like how Will Koehrsen put it: the higher or lower the Z-score, the less likely the result is to be random and the more likely it is to be meaningful.

But how high (or low) is considered convincing enough to quantify how meaningful our results are?

The climax

Here we need the last piece to solve the puzzle, the p-value, and check if our results are statistically significant based on the significance level (also known as alpha) that we set before starting our experiment.

3. What is a P-value?


Finally… Here we are talking about the p-value!

All of the previous explanations were meant to set the stage and lead us to the p-value. We need that context and those steps to understand this mysterious (actually not so mysterious) p-value and how it leads to our hypothesis-testing decisions.

If you’ve gotten this far, keep reading. Because this section is the most exciting part of all!

Instead of explaining p-values using Wikipedia’s definition (sorry, Wikipedia), let’s explain them in our context – pizza delivery time!

Recall that we randomly sampled some pizza delivery times, and the goal is to check whether the average delivery time is more than 30 minutes. If the evidence supports the pizzeria’s claim (an average delivery time of 30 minutes or less), then we will not reject the null hypothesis. Otherwise, we reject the null hypothesis.

So the task of the p-value is to answer this question:

If I live in a world where pizza delivery times are 30 minutes or less (null hypothesis true), how surprising is my evidence in real life?

The P-value answers this question with a number – a probability.

The lower the p-value, the more surprising the evidence and the more ridiculous our null hypothesis looks.

And what do we do when we feel ridiculous about our null hypothesis? We reject it and choose our alternative hypothesis.

If the p-value is below a given level of significance (people call it alpha, I call it the ridiculousness threshold – don’t ask me why, it’s just easier for me to understand), then we reject the null hypothesis.

Now we understand what the p-value means. Let’s apply this to our case.

P-value in pizza delivery time calculation

Having collected some sample delivery-time data, we run the calculation and find that the average delivery time is 10 minutes longer, with a p-value of 0.03.

This means that in a world where pizza delivery times are 30 minutes or less (the null hypothesis is true), there is a 3% chance that we will see average delivery times at least 10 minutes longer due to random noise.

The smaller the p-value, the more significant the result will be because it is less likely to be caused by noise.
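One way to see this interpretation in action is a small simulation: pretend we live in the null world and count how often noise alone produces a result at least as extreme as ours. The population parameters below (a standard deviation of 17 minutes, a sample of 10 deliveries) are invented, chosen so the simulated p-value lands near the 0.03 used in this article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical null world: deliveries really do average 30 minutes.
mu_0, sigma, n = 30.0, 17.0, 10
observed_mean = 40.0  # our sample averaged 10 minutes longer than claimed

# Draw many samples from the null world and count how often random
# noise alone yields a sample mean at least as extreme as ours.
sims = rng.normal(mu_0, sigma, size=(100_000, n)).mean(axis=1)
p_value = (sims >= observed_mean).mean()
print(f"simulated p-value ~ {p_value:.3f}")  # about 0.03
```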

Here is the most common misinterpretation of the p-value:

“A p-value of 0.03 means there is a 3% chance that the result is due to chance” – which is not true. The p-value is calculated assuming the null hypothesis is true, so it cannot tell you the probability that chance alone explains the result.

People often want a definite answer (myself included), and that’s why I was confused about the interpretation of p-values for a long time.

The p-value *proves* nothing. It’s just a way of using surprise as the basis for making a smart decision.
— Cassie Kozyrkov

Here’s how we can use a p-value of 0.03 to help us make an intelligent decision (IMPORTANT):

  • Imagine we live in a world where the average delivery time is always 30 minutes or less — because we believe in the pizzeria (our original belief)!
  • After analyzing the collected sample of delivery times, the p-value of 0.03 is lower than the 0.05 significance level (assuming we set that level before the experiment), so we can say that the result is statistically significant.
  • Since we have always believed the pizzeria could fulfill its promise to deliver pizza in 30 minutes or less, we now need to consider whether this belief still makes sense, since the result tells us that the pizzeria does not fulfill its promise and that the result is statistically significant.
  • So what do we do? First, we try to think of every possible way to keep our initial belief (the null hypothesis) alive. But as the pizzeria slowly accumulates bad reviews and keeps making bad excuses for delivery delays, even we ourselves feel ridiculous justifying it, and so we decide to reject the null hypothesis.
  • Finally, the next smart decision is not to buy more pizza from this place.

By now, you may have figured something out: depending on our context, p-values are not used to prove or justify anything.

In my opinion, p-values are used as a tool to challenge our initial belief (the null hypothesis) when the result is statistically significant. The moment we feel ridiculous about our own belief (that is, when the p-value shows the result is statistically significant), we discard our original belief (reject the null hypothesis) and make a reasonable decision.

4. Statistical significance

Finally, this is the last step where we put everything together and check if the result is statistically significant.

It’s not enough just to have a p-value; we also need to set a threshold (the significance level, alpha). Alpha should always be set before an experiment to avoid bias. If the observed p-value is lower than alpha, we conclude that the result is statistically significant.

The rule of thumb is to set alpha to 0.05 or 0.01 (again, the value depends on your task).

As mentioned earlier, assuming we set alpha to 0.05 before starting the experiment, the result is statistically significant because the p-value of 0.03 is lower than alpha.

For reference, below are the main steps of the whole experiment:

  1. Formulate the null hypothesis
  2. Formulate an alternative hypothesis
  3. Define alpha value to use
  4. Find the Z-score associated with your alpha level
  5. Find the test statistic using the formula z = (x̄ − μ₀) / (s / √n), where x̄ is the sample mean, μ₀ is the claimed mean, s is the sample standard deviation and n is the sample size
  6. If the test statistic is more extreme than the critical Z-score for your alpha level (or the p-value is less than the alpha value), reject the null hypothesis. Otherwise, do not reject the null hypothesis. (All six steps are sketched in code below.)
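Here is a sketch of all six steps for the pizza example, again using invented sample data (so the resulting p-value won’t match the 0.03 from the narrative):

```python
import numpy as np
from scipy import stats

# Steps 1-2, the hypotheses: H0: mu <= 30, H1: mu > 30.
times = np.array([32, 41, 28, 45, 38, 33, 40, 36, 31, 44])  # hypothetical data

alpha = 0.05                        # step 3: choose alpha in advance
z_crit = stats.norm.ppf(1 - alpha)  # step 4: one-sided critical Z-score

# Step 5: test statistic z = (x_bar - mu_0) / (s / sqrt(n)).
mu_0 = 30.0
z = (times.mean() - mu_0) / (times.std(ddof=1) / np.sqrt(len(times)))
p_value = stats.norm.sf(z)

# Step 6: decide.
if z > z_crit:  # equivalently: p_value < alpha
    print(f"z = {z:.2f} > {z_crit:.2f} (p = {p_value:.4f}): reject H0")
else:
    print(f"z = {z:.2f} <= {z_crit:.2f} (p = {p_value:.4f}): do not reject H0")
```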

If you want to know more about statistical significance, feel free to check out Statistical Significance Explained, written by Will Koehrsen.

Follow-up reflections

There’s a lot to digest here, right?

I can’t deny that p-values are inherently confusing to a lot of people, and it took me quite some time to really understand and appreciate what they mean and how they can be applied in our decision-making process as data scientists.

But don’t rely too much on p-values: they inform only a small part of the whole decision-making process.

I hope this explanation has made p-values intuitive and helps you understand what they really mean and how they can be used to test your hypotheses.

Calculating a p-value by itself is simple. The tricky part comes when we want to interpret p-values in hypothesis testing. I hope that the difficult part now feels a little easier.

If you want to learn more about statistics, I strongly recommend that you read this book (which I’m currently reading!) – Practical Statistics for Data Scientists, specially written for data scientists to understand the fundamental concepts of statistics.


What is a z-score? What is a p-value?—ArcGIS Pro

Most statistical tests begin with a null hypothesis. The null hypothesis for the spatial pattern analysis tools (the Analyzing Patterns and Mapping Clusters toolsets) is complete spatial randomness (CSR), either of the features themselves or of the values associated with those features. The z-scores and p-values returned by the pattern analysis tools tell you whether or not you can reject that null hypothesis. Typically, you run one of the pattern analysis tools hoping that the z-score and p-value will indicate that you can reject the null hypothesis. That would tell you that your features, or the values associated with them, exhibit statistically significant clustering or dispersion. Whenever you see spatial structure, such as clustering, in the landscape (or in your spatial data), you are seeing evidence of some underlying spatial process at work, and as a geographer or GIS analyst, this is often what interests you most.

A p-value is a probability. For the pattern analysis tools, it is the probability that the observed spatial pattern was created by some random process. When the p-value is very small, it is very unlikely (small probability) that the observed spatial pattern is the result of random processes, so you can reject the null hypothesis. You may ask: how small is small enough? Good question. See the table and discussion below.

Z-scores are standard deviations. If, for example, a tool returns a z-score of +2.5, you would say the result is 2.5 standard deviations above the mean. Both z-scores and p-values are associated with the standard normal distribution.

Very high or very low (negative) z-scores, associated with very small p-values, are found in the tails of the normal distribution. When you run a pattern analysis tool and it returns small p-values and either a very high or a very low z-score, this indicates that the observed spatial pattern is unlikely to reflect the theoretical random pattern represented by your null hypothesis.

To reject the null hypothesis, you must make a subjective judgment about the degree of risk you are willing to accept of being wrong (of falsely rejecting the null hypothesis). Consequently, before you run the spatial statistic, you select a confidence level. Typical confidence levels are 90, 95, or 99 percent. A 99 percent confidence level is the most conservative of these, indicating that you are unwilling to reject the null hypothesis unless the probability that the pattern was created by random chance is really small (less than a 1 percent chance).

Confidence levels

The table below shows the unadjusted critical p-values and z-scores for various confidence levels.

Tools that apply the FDR correction use adjusted critical p-values. Those critical values will be the same as or smaller than the ones shown in this table.

z-score (standard deviations) | p-value (probability) | confidence level
< -1.65 or > +1.65            | < 0.10                | 90%
< -1.96 or > +1.96            | < 0.05                | 95%
< -2.58 or > +2.58            | < 0.01                | 99%

Consider an example. The critical z-score values for a 95 percent confidence level are -1.96 and +1.96 standard deviations. The unadjusted p-value associated with a 95 percent confidence level is 0.05. If the z-score is between -1.96 and +1.96, the unadjusted p-value will be larger than 0.05, and you cannot reject the null hypothesis, because the pattern exhibited could quite likely be the result of random spatial processes. If the z-score falls outside that range (for example, -2.5 or +5.4 standard deviations), the observed spatial pattern is probably too unusual to be the result of a random process, and the p-value will be small to reflect this. In that case, you can reject the null hypothesis and proceed to find out what might be causing the statistically significant spatial structure in your data.

The key idea here is that values in the middle of the normal distribution (z-scores such as 0.19 or -1.2, for example) represent the expected outcome. When the absolute value of the z-score is large and the probabilities are small (in the tails of the normal distribution), however, you are seeing something unusual and generally very interesting. For the Hot Spot Analysis tool, for example, “unusual” means a statistically significant hot or cold spot.
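The relationship between z-scores, two-sided p-values and the confidence levels in the table above can be reproduced in a few lines; a sketch using scipy’s standard normal functions:

```python
from scipy import stats

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z-score under the standard normal."""
    return 2 * stats.norm.sf(abs(z))

# Reproduce the critical values from the confidence-level table.
for conf in (0.90, 0.95, 0.99):
    alpha = 1 - conf
    z_crit = stats.norm.ppf(1 - alpha / 2)
    print(f"{conf:.0%}: |z| > {z_crit:.2f}  <=>  p < {alpha:.2f}")

print(two_sided_p(2.5))   # ~0.0124: significant at 95%, but not at 99%
print(two_sided_p(0.19))  # ~0.85: an expected, unremarkable value
```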

FDR correction

Local spatial pattern analysis tools, including Hot Spot Analysis and Cluster and Outlier Analysis (Anselin Local Moran’s I), offer an optional Apply FDR (False Discovery Rate) Correction setting. When this setting is enabled, the FDR correction lowers the critical p-value thresholds shown in the table above to account for multiple testing and spatial dependency. The reduction, if any, is a function of the number of input features and the neighborhood structure employed.

Local spatial pattern analysis tools work by considering each feature within the context of neighboring features and determining whether the local pattern (a target feature and its neighbors) differs from the global pattern (all features in the dataset). The z-score and p-value results associated with each feature let you determine whether the difference is statistically significant or not. This analytical approach creates issues with both multiple testing and spatial dependency.

Multiple testing – with a 95 percent confidence level, probability theory tells us that there are 5 chances in 100 that a spatial pattern could appear structured (clustered or dispersed, for example) and be associated with a statistically significant p-value when, in fact, the underlying spatial processes creating the pattern are random. In that case, we would incorrectly reject the null hypothesis on the basis of a statistically significant p-value. Five chances in 100 sounds quite convincing until you realize that local spatial statistics perform a test on every feature in the dataset. If there are 10,000 features, for example, we might expect as many as 500 false results.

Spatial dependency – features near each other tend to be similar; more often than not, spatial data exhibits this type of dependency. Nonetheless, many statistical tests require features to be independent. This matters for local pattern analysis tools because spatial dependency can artificially inflate statistical significance. Spatial dependency is exacerbated with local pattern analysis tools because each feature is evaluated within the context of its neighbors, and features near one another will share many of the same neighbors. This overlap accentuates spatial dependency.

At least three approaches are used to deal with the problems of multiple testing and spatial dependency. The first approach is to ignore the problem, on the basis that each test performed on each feature in the dataset should be considered in isolation. With this approach, however, it is very likely that some statistically significant results will be incorrect (they will appear statistically significant even though the underlying spatial processes are random). The second approach is to apply a classical multiple-testing procedure such as the Bonferroni or Sidak correction. These methods, however, are typically too conservative: although they greatly reduce the number of false positives, they also miss statistically significant results that do exist. The third approach is to apply the FDR correction, which estimates the number of false positives for a given confidence level and adjusts the critical p-value accordingly. With this method, statistically significant p-values are ranked from smallest (strongest) to largest (weakest) and, based on the false-positive estimate, the weakest ones are removed from the list. The remaining features with statistically significant p-values are identified by the Gi_Bin or COType fields in the output feature class. While not perfect, this method has been shown in empirical tests to perform better than running each test in isolation or applying traditional, often overly conservative, multiple-testing methods. A sketch of one common FDR procedure follows. See the Additional Resources section for more information on the FDR correction.
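As an illustration of that third approach, here is a minimal sketch of the Benjamini-Hochberg procedure, one common way of controlling the false discovery rate (the exact adjustment the ArcGIS tools apply may differ in its details):

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of the p-values kept as significant under FDR control."""
    p = np.asarray(p_values)
    order = np.argsort(p)                 # rank p-values, smallest first
    ranks = np.arange(1, len(p) + 1)
    # Largest rank k with p_(k) <= (k / m) * alpha; ranks 1..k are kept.
    below = p[order] <= ranks / len(p) * alpha
    keep = np.zeros(len(p), dtype=bool)
    if below.any():
        cutoff = np.where(below)[0].max()
        keep[order[: cutoff + 1]] = True
    return keep

p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.740]
print(benjamini_hochberg(p_vals))  # only the two strongest results survive
```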

Null Hypothesis and Spatial Statistics

Several statistics in the Spatial Statistics toolbox are inferential spatial pattern analysis techniques, such as Spatial Autocorrelation (Global Moran’s I), Cluster and Outlier Analysis (Anselin Local Moran’s I), and Hot Spot Analysis (Getis-Ord Gi*). Inferential statistics are grounded in probability theory. Probability is a measure of chance, and underlying all statistical tests (either directly or indirectly) are probability calculations that assess the role of chance in the outcome of your analysis. Typically, with traditional (nonspatial) statistics, you work with a random sample and try to determine the probability that your sample data is a good representation (is reflective) of the population at large. As an example, you might ask, “What are the chances that the results of my voter poll (showing Candidate A will slightly outperform Candidate B) reflect the final election results?” But in most cases, when working with spatial statistics, including the spatial autocorrelation statistics mentioned above, you typically use all the data available for the study area (all the crimes, all the disease cases, attributes for every census tract, and so on). When you compute a statistic for the entire population, you no longer have an estimate at all. You have a fact. Consequently, it makes no sense to talk about likelihood or probabilities anymore. So how can the spatial pattern analysis tools, often applied to all the data in a study area, legitimately report probabilities? The answer is that they can do this by postulating, via the null hypothesis, that the data is, in fact, part of some larger population. Let’s consider this in more detail.

Randomization null hypothesis – where appropriate, the tools in the Spatial Statistics toolbox use the randomization null hypothesis as the basis for statistical significance testing. The randomization null hypothesis postulates that the observed spatial pattern of your data represents one of many (n!) possible spatial arrangements of the data. If you could pick up your data values and drop them onto the features in your study area, you would have one possible spatial arrangement of those values. (Note that picking up your data values and throwing them down randomly is an example of a random spatial process.) The randomization null hypothesis states that if you could do this exercise (pick them up, throw them down) infinitely many times, most of the time you would produce a pattern that is not markedly different from the observed pattern (your real data). Once in a while you might accidentally throw all the highest values into the same corner of your study area, but the probability of that happening is small. The randomization null hypothesis states that your data is one of many, many, many possible versions of complete spatial randomness. The data values are fixed; only their spatial arrangement could vary. A toy sketch of this idea follows.
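Here is a toy sketch of that “pick them up and throw them down” idea, with a made-up one-dimensional arrangement and statistic (a real spatial tool would use something like Moran’s I over a neighborhood graph instead):

```python
import numpy as np

rng = np.random.default_rng(0)

def randomization_p_value(values, statistic, n_perms=9999):
    """Monte Carlo randomization test: shuffle the observed values into
    random arrangements and count how often a rearrangement yields a
    statistic at least as extreme as the observed one."""
    observed = statistic(values)
    count = sum(
        statistic(rng.permutation(values)) >= observed
        for _ in range(n_perms)
    )
    return (count + 1) / (n_perms + 1)  # +1 counts the observed arrangement

# Toy statistic: similarity of neighboring positions (higher = more clustered).
clustering = lambda v: -np.abs(np.diff(v)).mean()

values = np.array([1, 2, 2, 3, 3, 9, 9, 10, 10, 11], dtype=float)
print(randomization_p_value(values, clustering))  # small: likely not random
```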

Normalization null hypothesis – a common alternative null hypothesis, not implemented for the Spatial Statistics toolbox, is the normalization null hypothesis. The normalization null hypothesis postulates that the observed values are derived from an infinitely large, normally distributed population of values through some random sampling process. With a different sample you would get different values, but you would still expect those values to be representative of the larger distribution. The normalization null hypothesis states that the values represent one of many possible samples of values. If you could fit your observed data to a normal curve and randomly select values from that distribution to toss onto your study area, most of the time you would produce a pattern and distribution of values that would not be markedly different from the observed pattern/distribution (your real data).