ALT Normal Values: Understanding High, Low, and Normal Results, Symptoms, and Causes
What are the normal ALT levels in blood? How do you interpret high and low ALT test results? What are the common causes of abnormal ALT levels? What symptoms may indicate liver problems related to ALT?
What is ALT and Why is it Important?
Alanine aminotransferase (ALT) is an enzyme primarily found in the liver. It plays a crucial role in amino acid metabolism and is an important indicator of liver health. When liver cells are damaged, ALT is released into the bloodstream, causing elevated levels in blood tests.
ALT testing is essential for several reasons:
- Detecting liver disease or damage
- Monitoring the progression of liver conditions
- Evaluating the effectiveness of treatments
- Assessing potential side effects of medications
Understanding ALT levels can provide valuable insights into overall liver function and help healthcare providers make informed decisions about patient care.
Normal ALT Levels: What’s Considered Healthy?
Normal ALT levels can vary depending on factors such as age, gender, and the specific laboratory conducting the test. However, general guidelines for normal ALT levels are:
- Men: 7 to 55 units per liter (U/L)
- Women: 7 to 45 U/L
It’s important to note that these ranges may differ slightly between laboratories. Always consult with your healthcare provider for the most accurate interpretation of your test results.
Factors Affecting Normal ALT Levels
Several factors can influence ALT levels, even within the normal range:
- Age: ALT levels tend to decrease slightly with age
- Gender: Men typically have higher ALT levels than women
- Body mass index (BMI): Higher BMI is associated with slightly elevated ALT levels
- Ethnicity: Some studies suggest variations in ALT levels among different ethnic groups
- Time of day: ALT levels may fluctuate throughout the day
High ALT Levels: Causes and Implications
Elevated ALT levels often indicate liver damage or disease. Some common causes of high ALT include:
- Hepatitis (viral, alcoholic, or autoimmune)
- Nonalcoholic fatty liver disease (NAFLD)
- Cirrhosis
- Liver cancer
- Medications (e.g., statins, antibiotics, acetaminophen)
- Alcohol abuse
- Obesity
When ALT levels are significantly elevated, it may indicate acute liver injury or severe liver disease. In such cases, further testing and medical evaluation are crucial to determine the underlying cause and appropriate treatment.
Symptoms Associated with High ALT Levels
While elevated ALT levels themselves don’t cause symptoms, the underlying liver conditions may present with various signs:
- Jaundice (yellowing of skin and eyes)
- Fatigue
- Abdominal pain or swelling
- Nausea and vomiting
- Dark urine
- Pale stools
- Unexplained weight loss
Low ALT Levels: Are They a Concern?
While high ALT levels are often a cause for concern, low ALT levels are generally not considered clinically significant. In some cases, very low ALT levels may be associated with:
- Vitamin B6 deficiency
- Severe liver damage (in advanced stages of liver disease)
- Chronic kidney disease
However, it’s important to note that low ALT levels alone are not typically used as a diagnostic tool. If you have concerns about low ALT levels, consult your healthcare provider for a comprehensive evaluation.
ALT Test: Procedure and Preparation
The ALT test is a simple blood test that requires no special preparation. Here’s what you can expect:
- A healthcare professional will clean the area, usually on your arm, where the blood will be drawn.
- A small needle will be inserted into a vein to collect a blood sample.
- The sample is sent to a laboratory for analysis.
- Results are typically available within a few days.
While fasting is not usually required for an ALT test, your doctor may recommend avoiding strenuous exercise before the test, as it can temporarily elevate ALT levels.
Are there any risks associated with the ALT test?
The ALT test is generally safe with minimal risks. Some people may experience slight discomfort or bruising at the needle insertion site. In rare cases, complications such as excessive bleeding or infection may occur. If you have any concerns, discuss them with your healthcare provider before the test.
Interpreting ALT Test Results
Interpreting ALT test results requires considering various factors, including:
- The specific value of ALT
- Other liver function test results (e.g., AST, alkaline phosphatase)
- Patient’s medical history and risk factors
- Presence of symptoms
Here’s a general guide to interpreting ALT results:
- Normal: Within the laboratory’s reference range
- Mildly elevated: 1-3 times the upper limit of normal
- Moderately elevated: 3-20 times the upper limit of normal
- Severely elevated: More than 20 times the upper limit of normal
It’s crucial to remember that ALT levels should be interpreted in conjunction with other test results and clinical findings. Your healthcare provider will consider all these factors when assessing your liver health.
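The guide above can be sketched as a small function that expresses an ALT result as a multiple of the laboratory's upper limit of normal (ULN). This is only an illustration: the function name, the label for non-elevated results, and the handling of values that fall exactly on a boundary (3× and 20×) are assumptions, and the ULN must come from the reporting laboratory.

```python
def classify_alt(alt_u_per_l: float, upper_limit: float) -> str:
    """Band an ALT result by its multiple of the upper limit of normal (ULN).

    Hypothetical helper: a result exactly on a boundary (3x or 20x ULN)
    is assigned to the lower band, which is an assumption.
    """
    if alt_u_per_l < 0 or upper_limit <= 0:
        raise ValueError("ALT must be non-negative and the ULN positive")
    multiple = alt_u_per_l / upper_limit
    if multiple <= 1:
        return "not elevated"
    if multiple <= 3:
        return "mildly elevated"
    if multiple <= 20:
        return "moderately elevated"
    return "severely elevated"


# Example with a ULN of 55 U/L (the upper end of the range quoted for men).
for value in (30, 120, 600, 1200):
    print(value, "U/L:", classify_alt(value, upper_limit=55))
```

The band alone says nothing about the cause; it only restates the number in the terms used in the guide.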
ALT to AST Ratio: What Does It Mean?
The ratio of ALT to AST (another liver enzyme) can provide additional insights into liver health:
- ALT:AST ratio < 1: May indicate advanced liver disease or alcohol-related liver damage
- ALT:AST ratio > 1: Often seen in viral hepatitis or drug-induced liver injury
However, this ratio should be interpreted cautiously and in conjunction with other clinical information.
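As a minimal sketch of the arithmetic, the ratio is simply the ALT value divided by the AST value from the same blood sample. The function names and the wording of the returned descriptions are illustrative assumptions, and the output is no substitute for clinical interpretation.

```python
def alt_ast_ratio(alt: float, ast: float) -> float:
    """Return ALT divided by AST (both in U/L, from the same sample)."""
    if alt < 0 or ast <= 0:
        raise ValueError("ALT must be non-negative and AST positive")
    return alt / ast


def describe_ratio(ratio: float) -> str:
    """Map the ratio to the two patterns described in the text above."""
    if ratio < 1:
        return "ratio < 1: may indicate advanced or alcohol-related liver disease"
    if ratio > 1:
        return "ratio > 1: often seen in viral hepatitis or drug-induced injury"
    return "ratio = 1: neither pattern"


ratio = alt_ast_ratio(alt=90, ast=60)  # hypothetical values
print(f"{ratio:.2f} ->", describe_ratio(ratio))
```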
Managing Abnormal ALT Levels
If your ALT levels are abnormal, your healthcare provider may recommend various strategies to address the underlying cause:
- Lifestyle modifications:
  - Reducing alcohol consumption
  - Maintaining a healthy weight
  - Following a balanced diet
  - Increasing physical activity
- Medication adjustments:
  - Changing dosages or switching medications if drug-induced liver injury is suspected
- Treatment of underlying conditions:
  - Antiviral therapy for viral hepatitis
  - Management of autoimmune liver diseases
- Regular monitoring:
  - Periodic ALT tests to track progress and response to interventions
The specific management plan will depend on the cause and severity of the liver condition. Always follow your healthcare provider’s recommendations and attend follow-up appointments as scheduled.
Preventing Liver Damage and Maintaining Healthy ALT Levels
While some causes of elevated ALT levels are beyond our control, there are several steps you can take to promote liver health and maintain normal ALT levels:
- Limit alcohol consumption
- Maintain a healthy weight
- Exercise regularly
- Follow a balanced, nutritious diet
- Avoid unnecessary medications and supplements
- Get vaccinated against hepatitis A and B
- Practice safe sex and avoid sharing needles
- Manage chronic conditions like diabetes and high cholesterol
By adopting these healthy habits, you can reduce your risk of liver damage and help maintain normal ALT levels.
Liver-Friendly Foods to Include in Your Diet
Certain foods may support liver health and help maintain normal ALT levels:
- Leafy green vegetables (e.g., spinach, kale)
- Cruciferous vegetables (e.g., broccoli, Brussels sprouts)
- Berries (e.g., blueberries, strawberries)
- Fatty fish rich in omega-3 fatty acids
- Nuts and seeds
- Green tea
- Turmeric
- Garlic
Incorporating these foods into your diet, along with a balanced intake of other nutrients, can contribute to overall liver health.
When to Seek Medical Attention for ALT Concerns
While routine ALT testing is often part of regular health check-ups, there are certain situations where you should seek immediate medical attention:
- Sudden onset of jaundice
- Severe abdominal pain
- Persistent nausea and vomiting
- Signs of internal bleeding (e.g., black, tarry stools)
- Confusion or altered mental state
- Rapid weight gain and swelling
These symptoms may indicate serious liver problems that require prompt medical evaluation and treatment. Don’t hesitate to contact your healthcare provider if you experience any of these symptoms or have concerns about your liver health.
Follow-up Testing and Monitoring
If your ALT levels are abnormal, your healthcare provider may recommend additional tests to further evaluate your liver health:
- Complete liver function panel
- Hepatitis virus testing
- Imaging studies (e.g., ultrasound, CT scan, or MRI)
- Liver biopsy (in some cases)
Regular monitoring of ALT levels and other liver function tests may be necessary to track the progression of liver conditions or the effectiveness of treatments.
The Future of Liver Health Monitoring: Beyond ALT
While ALT remains a crucial marker for liver health, research is ongoing to develop more advanced and comprehensive methods of assessing liver function:
- Non-invasive imaging techniques: Advanced ultrasound and MRI technologies are being developed to assess liver fibrosis and steatosis without the need for biopsies.
- Biomarker panels: Researchers are exploring combinations of blood-based biomarkers that may provide more accurate and early detection of liver diseases.
- Genetic testing: Identifying genetic variants associated with increased risk of liver diseases may help in early intervention and personalized treatment strategies.
- Artificial intelligence: Machine learning algorithms are being developed to analyze complex data sets and improve the accuracy of liver disease diagnosis and prognosis.
These advancements hold promise for more precise and personalized liver health assessments in the future. However, ALT testing remains a valuable and widely used tool in current clinical practice.
The Role of Patient Education in Liver Health
Empowering patients with knowledge about ALT levels and liver health is crucial for prevention and early intervention. Healthcare providers play a vital role in:
- Explaining the significance of ALT test results
- Discussing lifestyle modifications to promote liver health
- Addressing patient concerns and questions about liver function
- Providing resources for further education on liver health
By fostering open communication and patient education, healthcare providers can help individuals take a proactive approach to maintaining healthy ALT levels and overall liver function.
Alanine Aminotransferase (ALT) Test | HealthLink BC
Test Overview
An alanine aminotransferase (ALT) test measures the amount of this enzyme in the blood. ALT is found mainly in the liver, but also in smaller amounts in the kidneys, heart, muscles, and pancreas. ALT was formerly called serum glutamic pyruvic transaminase (SGPT).
ALT is measured to see if the liver is damaged or diseased. Low levels of ALT are normally found in the blood. But when the liver is damaged or diseased, it releases ALT into the bloodstream, which makes ALT levels go up. Most increases in ALT levels are caused by liver damage.
The ALT test is often done along with other tests that check for liver damage, including aspartate aminotransferase (AST), alkaline phosphatase, lactate dehydrogenase (LDH), and bilirubin. Both ALT and AST levels are reliable tests for liver damage.
Why It Is Done
The ALT test is done to:
- Identify liver disease, such as cirrhosis and hepatitis, caused by alcohol, drugs, or viruses.
- Help check for liver damage.
- Find out whether jaundice was caused by a blood disorder or liver disease.
- Keep track of the effects of cholesterol-lowering medicines and other medicines that can damage the liver.
How To Prepare
Avoid strenuous exercise just before having an ALT test.
How It Is Done
A health professional uses a needle to take a blood sample, usually from the arm.
How long the test takes
The test will take a few minutes.
How It Feels
When a blood sample is taken, you may feel nothing at all from the needle. Or you might feel a quick sting or pinch.
Risks
There is very little chance of having a problem from this test. When a blood sample is taken, a small bruise may form at the site.
Results
Each lab has a different range for what’s normal. Your lab report should show the range that your lab uses for each test. The normal range is just a guide. Your doctor will also look at your results based on your age, health, and other factors. A value that isn’t in the normal range may still be normal for you.
Results are usually available within 12 hours.
High values
High levels of ALT may be caused by:
- Liver damage from conditions such as hepatitis or cirrhosis.
- Lead poisoning.
- Very strenuous exercise or severe injury to a muscle.
- Exposure to carbon tetrachloride.
- Decay of a large tumour (necrosis).
- Many medicines, such as statins, antibiotics, chemotherapy, aspirin, opioids, and barbiturates.
- Mononucleosis.
- Growth spurts, especially in young children. Rapid growth can cause mildly elevated levels of ALT.
Alanine aminotransferase (ALT) Test
Also Known As
Serum glutamic-pyruvic transaminase (SGPT)
At a Glance
Why Get Tested?
To screen for liver disease
When To Get Tested?
If your doctor thinks that you have symptoms of a liver disorder
Sample Required?
A blood sample will be taken from a vein in the arm
Test Preparation Needed?
No test preparation is needed, although you should inform your doctor about any drugs you are taking
On average it takes 7 working days for the blood test results to come back from the hospital, depending on the exact tests requested. Some specialist test results may take longer, if samples have to be sent to a reference (specialist) laboratory. The X-ray & scan results may take longer. If you are registered to use the online services of your local practice, you may be able to access your results online. Your GP practice will be able to provide specific details.
If the doctor wants to see you about the result(s), you will be offered an appointment. If you are concerned about your test results, you will need to arrange an appointment with your doctor so that all relevant information including age, ethnicity, health history, signs and symptoms, laboratory and other procedures (radiology, endoscopy, etc.), can be considered.
Lab Tests Online-UK is an educational website designed to provide patients and carers with information on laboratory tests used in medical care. We are not a laboratory and are unable to comment on an individual’s health and treatment.
Reference ranges are dependent on many factors, including patient age, sex, sample population, and test method, and numeric test results can have different meanings in different laboratories.
For these reasons, you will not find reference ranges for the majority of tests described on this web site. The lab report containing your test results should include the relevant reference range for your test(s). Please consult your doctor or the laboratory that performed the test(s) to obtain the reference range if you do not have the lab report.
For more information on reference ranges, please read Reference Ranges and What They Mean.
What is being tested?
ALT is an enzyme found mostly in the liver; smaller amounts are also found in the kidneys, heart and muscles. When the liver is damaged, ALT is released into the bloodstream, hence increasing the concentration that can be detected in a blood test. This often happens before more obvious symptoms of liver damage occur, such as jaundice (yellowing of the eyes and skin).
Common Questions
How is it used?
The ALT blood test detects liver injury. ALT results are usually assessed alongside the results of other blood tests such as alkaline phosphatase (ALP), gamma-glutamyl transferase (GGT) and aspartate aminotransferase (AST) to help determine which form of liver disease is present.
When is it requested?
A doctor usually requests an ALT test with other laboratory investigations to evaluate a patient who has symptoms of a liver disorder. ALT is used to identify liver damage. Some of these symptoms include jaundice, dark urine, nausea, vomiting, abdominal swelling, unusual weight gain and abdominal pain. ALT can also be used, either by itself or with other tests, for patients at risk of developing liver disease such as:
- persons who have a history of known or possible exposure to hepatitis viruses
- those who drink too much alcohol
- those whose family have a history of liver disease
- people who take drugs that might damage the liver
- those who are overweight or who have diabetes
In people with mild symptoms, such as tiredness or loss of energy, ALT may be tested to make sure they do not have chronic (long-term) liver disease. ALT is often used to monitor the treatment of persons who have liver disease to see if the treatment is working and may be requested either by itself or along with other blood tests.
What does the test result mean?
Very high concentrations of ALT (more than 10 times the highest normal level) are usually due to acute (short-term) hepatitis, often due to a viral infection. In acute hepatitis, the concentration of ALT usually stays high for about 1–2 months but can take as long as 3–6 months to return to normal.
ALT concentrations are usually not as high in chronic hepatitis, often less than 4 times the highest normal level. In this case, ALT concentrations often vary between normal and slightly increased, so doctors may request the test frequently to see if there is a pattern. A moderately high ALT can also occur when there is a high alcohol intake, diabetes or raised serum triglycerides, all of which can cause fatty liver.
In some liver diseases, especially when the bile ducts are blocked (cholestasis), when a person has cirrhosis, or when liver cancer is present, the concentration of ALT may be close to normal.
Is there anything else I should know?
An injection of medicine into the muscle tissue or strenuous exercise may increase ALT concentration as it is released into the bloodstream from muscle.
Certain drugs may cause liver damage, resulting in high ALT concentrations. This occurs in a very small percentage of patients and is true of both prescription drugs and some ‘natural’ health products. If your doctor finds that you have a high ALT, tell them about all the drugs and health products you are taking.
What is hepatitis?
Hepatitis is an inflammation of the liver. There are two major forms: acute and chronic. Acute hepatitis is a fast-developing disease and typically makes affected persons feel sick, as if they have the flu, often with loss of appetite and sometimes diarrhoea and vomiting. In many cases, acute hepatitis causes dark urine, pale stools and yellowing of the skin and eyes (jaundice). Most affected individuals eventually recover completely. Chronic (long-term) hepatitis usually causes no symptoms or causes only loss of energy and tiredness; most people don’t know that they have it. In some people, chronic hepatitis can gradually damage the liver and, after many years, cause it to fail.
What are the other liver tests?
Other commonly used liver tests include other enzymes found in liver cells, such as aspartate aminotransferase (AST) and alkaline phosphatase as well as bilirubin, which is a breakdown product from red blood cells removed from the body by the liver and spleen. The doctor will often order these tests together as a group and refer to them as ‘liver function tests’. Albumin is also frequently included in the liver function test profile because, as albumin is produced by the liver, it can be used as a measure of liver protein synthesis. However, other factors can affect the concentration of albumin in the blood such as poor nutrition or excessive loss from the gut or kidney.
Explaining p-Values for Beginner Data Scientists
[...] sigma” (meaning having a p-value of 0.0000003).
Back then, I didn’t know anything about p-value, hypothesis testing, or even statistical significance.
I decided to google the word “p-value” and what I found on Wikipedia made me even more confused…
When testing statistical hypotheses, the p-value, or probability value, for a given statistical model is the probability that, when the null hypothesis is true, the statistical summary (for example, the absolute value of the sample mean difference between two compared groups) will be greater than or equal to the actual observed results.
— Wikipedia
Good job, Wikipedia.
Okay. I didn’t understand what p-value actually means.
As I delved into the realm of data science, I finally began to understand the meaning of the p-value and where it can be used as part of the decision-making tools in certain experiments.
So I decided to explain the p-value in this article and how it can be used in hypothesis testing to give you a better and more intuitive understanding of p-values.
Although we can’t skip the fundamentals of the other concepts or the definition of the p-value, I promise I will keep this explanation intuitive without exposing you to all the technical terms I’ve come across.
There are four sections in this article to give you a complete picture from building a hypothesis test to understanding the p-value and using it in decision making. I highly recommend that you go through all of them to get a detailed understanding of p-values:
- Hypothesis Test
- Normal distribution
- What is a P-value?
- Statistical significance
It will be fun.
Let’s get started!
1. Hypothesis Testing
Before we talk about what p-value means, let’s start by looking at hypothesis testing, where p-value is used to determine the statistical significance of our results.
Our final goal is to determine the statistical significance of our results.
And statistical significance is built on these 3 simple ideas:
- Hypothesis testing
- Normal distribution
- P-value
Hypothesis testing is used to test the validity of a statement (null hypothesis) made about a population using sample data. The alternative hypothesis is the one you would believe if the null hypothesis were wrong.
In other words, we will create a statement (null hypothesis) and use the sample data to check if the statement is valid. If the statement is not true, we will choose an alternative hypothesis. Everything is very simple.
To find out if a claim is valid or not, we will use the p-value to weigh the strength of the evidence and see if it is statistically significant. If the evidence supports the alternative hypothesis, then we will reject the null hypothesis and accept the alternative hypothesis. This will be explained in a later section.
Let’s use an example to make this concept clearer, and this example will be used throughout this article for other concepts.
Example. Suppose a pizzeria claims to have an average delivery time of 30 minutes or less, but you think it’s longer than advertised. So you run a hypothesis test and randomly sample some delivery times to test the claim:
- Null hypothesis – Average delivery time is 30 minutes or less
- Alternative hypothesis – average delivery time exceeds 30 minutes
- The goal here is to determine which statement, null or alternative, is better supported by the data obtained from our sample data.
We will use a one-sided test in our case, since we only care whether the average delivery time exceeds 30 minutes. We ignore the other direction, because an average delivery time of 30 minutes or less is exactly what we want. In other words, we want to see whether the pizzeria has cheated us.
One common way to test hypotheses is to use the Z-test. We won’t go into details here, as we want to get a better understanding of what’s going on on the surface before diving deeper.
2. Normal distribution
The normal distribution is a probability density function used to view the distribution of data.
The normal distribution has two parameters, the mean (μ) and the standard deviation, also called sigma (σ).
The mean is the central tendency of the distribution. It specifies the location of the peak for normal distributions. Standard deviation is a measure of variability. It determines how far from the mean the values tend to fall.
The normal distribution is usually associated with the 68-95-99.7 rule.
- 68% of data are within 1 standard deviation (σ) of the mean (μ)
- 95% of data are within 2 standard deviations (σ) of the mean (μ)
- 99.7% of data are within 3 standard deviations (σ) of the mean (μ)
Remember the five sigma threshold for discovering the Higgs boson that I talked about at the beginning? Five sigma means that about 99.99994267% of the data fall within 5 standard deviations of the mean, so a five-sigma result is extremely unlikely to arise by chance. Scientists required this strict threshold before confirming the discovery of the Higgs boson, to avoid any possible false signals.
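The percentages in the 68-95-99.7 rule and the five sigma threshold can be checked with the error function from Python's standard library. The sketch below is only a numerical check; the helper names `fraction_within` and `upper_tail` are made up for illustration.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


def upper_tail(k: float) -> float:
    """Probability of landing more than k standard deviations above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2))


for k in (1, 2, 3, 5):
    print(f"within {k} sigma: {fraction_within(k):.10f}   "
          f"one-sided tail beyond {k} sigma: {upper_tail(k):.3e}")
```

The one-sided tail beyond 5 sigma comes out at roughly 0.0000003, which is the p-value quoted for five sigma at the start of the article.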
Cool. Now you might be wondering, “How does the normal distribution relate to our previous hypothesis testing?”
Since we used the Z-test to test our hypothesis, we need to calculate Z-scores (to be used in our test statistic). A Z-score is the number of standard deviations a data point lies from the mean. In our case, each data point is a pizza delivery time we collected.
Note that once we calculate the Z-score for each pizza delivery time and plot the standard bell curve, the unit on the x-axis changes from minutes to standard deviations, because we standardized the variable by subtracting the mean and dividing by the standard deviation: z = (x − μ) / σ.
Examining the standard bell curve is useful because we can compare test results with a “normal” population with a standardized unit in standard deviation, especially when we have a variable that comes with different units.
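To make the standardization concrete, here is a minimal sketch that converts a handful of delivery times into Z-scores. The delivery times are invented for illustration, and the sample's own mean and standard deviation stand in for μ and σ.

```python
import statistics

# Hypothetical delivery times in minutes (not data from the article).
delivery_minutes = [28, 35, 41, 30, 46, 38, 33, 49]

mean = statistics.fmean(delivery_minutes)
sd = statistics.pstdev(delivery_minutes)  # standard deviation of these values

# Standardize: subtract the mean, then divide by the standard deviation.
z_scores = [(x - mean) / sd for x in delivery_minutes]

for minutes, z in zip(delivery_minutes, z_scores):
    print(f"{minutes} min -> z = {z:+.2f}")
```

After standardizing, the values are unitless, with a mean of 0 and a standard deviation of 1, which is why the x-axis of the bell curve is expressed in standard deviations.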
I like how Will Kersen put it: the higher or lower the Z-score, the less likely the result is to have happened by chance, and the more likely it is to be meaningful.
But how high (or low) is considered convincing enough to quantify how meaningful our results are?
Climax
Here we need the last piece to solve the puzzle, the p-value, and check if our results are statistically significant based on the significance level (also known as alpha) that we set before starting our experiment.
3. What is a P-value?
Finally… Here we are talking about the p-value!
All of the previous explanations are meant to set the stage and get us to this P-value. We need the previous context and steps to understand this mysterious (actually not so mysterious) p-value and how it can lead to our hypothesis testing solutions.
If you’ve gotten this far, keep reading. Because this section is the most exciting part of all!
Instead of explaining p-values using Wikipedia’s definition (sorry Wikipedia), let’s explain it in our context – pizza delivery time!
Recall that we randomly sampled some pizza delivery times and the goal is to check if the delivery time is more than 30 minutes. If the definitive evidence supports the pizzeria’s claim (average delivery time is 30 minutes or less), then we will not reject the null hypothesis. Otherwise, we reject the null hypothesis.
So the task of the p-value is to answer this question:
If I live in a world where pizza delivery times are 30 minutes or less (null hypothesis true), how surprising is my evidence in real life?
The P-value answers this question with a number – a probability.
The lower the p-value, the more surprising the evidence is, and the more ridiculous our null hypothesis looks.
And what do we do when we feel ridiculous about our null hypothesis? We reject it and choose our alternative hypothesis.
If the p-value is below a given level of significance (people call it alpha, I call it the ridiculousness threshold – don’t ask me why, it’s just easier for me to understand), then we reject the null hypothesis.
Now we understand what the p-value means. Let’s apply this to our case.
P-value in pizza delivery time calculation
Now that we have collected some sample delivery time data, we run the calculation and find that the average delivery time is 10 minutes longer than advertised, with a p-value of 0.03.
This means that in a world where pizza delivery times are 30 minutes or less (the null hypothesis is true), there is a 3% chance that we will see average delivery times at least 10 minutes longer due to random noise.
The smaller the p-value, the more significant the result will be because it is less likely to be caused by noise.
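To see where a number like 0.03 can come from, here is a rough sketch of a one-sided Z-test followed by a simulation of the "null world". The article does not give the sample size or the standard deviation, so `SIGMA = 32` and `N = 36` are hypothetical values chosen only so that the p-value lands near 0.03; the null hypothesis is evaluated at its boundary, a true mean of exactly 30 minutes.

```python
import math
import random

MU_0 = 30.0         # mean delivery time claimed by the pizzeria (null boundary)
SAMPLE_MEAN = 40.0  # observed average: 10 minutes longer than advertised
SIGMA = 32.0        # hypothetical population standard deviation, assumed known
N = 36              # hypothetical number of sampled deliveries

# Test statistic: how many standard errors the sample mean lies above MU_0.
standard_error = SIGMA / math.sqrt(N)
z = (SAMPLE_MEAN - MU_0) / standard_error

# One-sided p-value: probability of a Z-score at least this large under the null.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

# Simulation: draw many sample means from a world where the null is true
# and count how often pure noise produces an average of 40 minutes or more.
rng = random.Random(0)
trials = 100_000
hits = sum(rng.gauss(MU_0, standard_error) >= SAMPLE_MEAN for _ in range(trials))
simulated = hits / trials

print(f"z = {z:.3f}, p-value = {p_value:.4f}, simulated = {simulated:.4f}")
```

The simulated fraction approximates the analytic p-value, which is the point of the definition: it measures how often noise alone would produce evidence at least this extreme when the null hypothesis holds.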
In a case like ours, many people misinterpret the p-value as follows:
A p-value of 0.03 means there is a 3% chance that the result is due to chance. This is not true.
People often want a certain answer (myself included), which is why I’ve been confused about the interpretation of p-values for a long time.
The p-value *proves* nothing. It’s just a way of using surprise as the basis for making a smart decision.
— Cassie Kozyrkov
Here’s how we can use a p-value of 0.03 to help us make an intelligent decision (IMPORTANT):
- Imagine we live in a world where the average delivery time is always 30 minutes or less — because we believe in the pizzeria (our original belief)!
- After analyzing the delivery times of the collected samples, the p-value of 0.03 is lower than the 0.05 significance level (assuming we set this value before our experiment), so we can say that the result is statistically significant.
- Since we have always believed that the pizzeria can fulfill its promise to deliver pizza in 30 minutes or less, we now need to consider whether this belief still makes sense, since the result tells us that the pizzeria does not fulfill its promise and the result is statistically significant.
- So what do we do? First, we try to think of every possible way to make our initial belief (the null hypothesis) true. But as the pizzeria slowly gets bad reviews from other people and often makes bad excuses that led to delivery delays, even we ourselves feel ridiculous to justify the pizzeria and hence we decide to reject the null hypothesis.
- Finally, the next smart decision is not to buy more pizza from this place.
By now, you may have figured something out… In our context, p-values are not used to prove or justify anything.
In my opinion p-values are used as a tool to challenge our initial belief (the null hypothesis) when the result is statistically significant. The moment we feel ridiculous with our own belief (assuming the p-value indicates that the result is statistically significant), we discard our original belief (reject the null hypothesis) and make a reasonable decision.
4. Statistical significance
Finally, this is the last step where we put everything together and check if the result is statistically significant.
It’s not enough to just have a p-value; we also need to set a threshold (the significance level, alpha). Alpha should always be set before an experiment to avoid bias. If the observed p-value is lower than alpha, then we conclude that the result is statistically significant.
The rule of thumb is to set alpha to 0.05 or 0.01 (again, the value depends on your task).
As mentioned earlier, if we set alpha to 0.05 before starting the experiment, the result is statistically significant because the p-value of 0.03 is lower than alpha.
For reference, below are the main steps of the whole experiment:
- Formulate the null hypothesis
- Formulate an alternative hypothesis
- Define alpha value to use
- Find the critical Z-score associated with your alpha level
- Compute the test statistic from the sample data
- If the test statistic is more extreme than the critical Z-score for the alpha level (or, equivalently, the p-value is less than alpha), reject the null hypothesis. Otherwise, do not reject the null hypothesis.
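The steps above can be sketched in Python for the pizza example. This is a minimal illustration, not the author’s own calculation: the delivery times are made up, the population standard deviation `sigma` is assumed to be known (5 minutes), and the test is assumed to be an upper-tailed one-sample z-test, because the belief being challenged is that the mean delivery time is 30 minutes or less. The function name `one_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Upper-tailed one-sample z-test: H0 is mean <= mu0, H1 is mean > mu0."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))  # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha)       # critical z for this alpha
    p_value = 1 - NormalDist().cdf(z)              # upper-tail probability
    return z, z_crit, p_value, p_value < alpha


# Hypothetical delivery times in minutes (made-up data, not from the article).
times = [31, 34, 29, 36, 33, 30, 35, 32, 37, 33]

# sigma=5 is an assumed, known population standard deviation.
z, z_crit, p, reject = one_sample_z_test(times, mu0=30, sigma=5)
print(f"z = {z:.2f}, critical z = {z_crit:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With this made-up sample the p-value comes out close to 0.03, so at alpha = 0.05 the null hypothesis is rejected. Comparing `z` with `z_crit` and comparing `p` with `alpha` always lead to the same decision.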
If you want to know more about statistical significance, feel free to check out this article – An Explanation of Statistical Significance, written by Will Koehrsen.
Follow-up reflections
There’s a lot to digest here, right?
I can’t deny that p-values are inherently confusing to a lot of people, and it took me quite some time to really understand and appreciate what p-values mean and how they can be applied within our decision-making process as data scientists.
But don’t rely too much on p-values, as they only help with a small part of the whole decision-making process.
I hope my explanation of p-values has been intuitive and helpful for your understanding of what p-values really mean and how they can be used to test your hypotheses.
Calculating p-values by itself is simple. The tricky part comes when we want to interpret p-values in hypothesis testing. I hope that now the difficult part becomes a little easier for you.
If you want to learn more about statistics, I strongly recommend that you read this book (which I’m currently reading!) – Practical Statistics for Data Scientists, specially written for data scientists to understand the fundamental concepts of statistics.
What is a z-score? What is a p-value?—ArcGIS Pro
Most statistical tests begin by identifying a null hypothesis. The null hypothesis for the pattern analysis tools (the Analyzing Patterns and Mapping Clusters toolsets) is complete spatial randomness (CSR), either of the features themselves or of the values associated with those features. The z-scores and p-values returned by the pattern analysis tools tell you whether you can reject that null hypothesis or not. Typically, you run one of the pattern analysis tools hoping that the z-score and p-value will indicate that you can reject the null hypothesis, because that would mean that, rather than a random pattern, your features (or the values associated with them) exhibit statistically significant clustering or dispersion. Whenever you see spatial structure, such as clustering in the landscape (or in your spatial data), you are seeing evidence of some underlying spatial processes at work, and as a geographer or GIS analyst, this is often what interests you most.
The p-value is a probability. For the pattern analysis tools, it is the probability that the observed spatial pattern was created by some random process. When the p-value is very small, it is very unlikely (small probability) that the observed spatial pattern is the result of random processes, so you can reject the null hypothesis. You might ask: how small is small enough? Good question. See the table and discussion below.
Z-scores are standard deviations. If, for example, a tool returns a z-score of +2.5, you would say that the result is 2.5 standard deviations from the mean. Both z-scores and p-values are associated with the standard normal distribution.
Very high or very low (negative) z-scores, associated with very small p-values, are found in the tails of the normal distribution. When you run a pattern analysis tool and it yields small p-values and either a very high or a very low z-score, this indicates that the observed spatial pattern is unlikely to reflect the theoretical random pattern represented by your null hypothesis (CSR).
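The link between a z-score and its p-value under the standard normal distribution can be sketched with Python’s standard library. This is an illustration of the general relationship only, not code taken from any ArcGIS tool; the function name `two_sided_p_value` is hypothetical, and a two-sided test is assumed because both very high and very low z-scores count as unusual.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a z-score at least this extreme, in either tail, under N(0, 1)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Z-scores near the middle of the distribution give large p-values;
# z-scores far out in the tails give small ones.
for z in (0.19, -1.2, 2.5, -2.58):
    print(f"z = {z:+.2f}  ->  p = {two_sided_p_value(z):.4f}")
```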
To reject the null hypothesis, you must make a subjective judgment about the degree of risk you are willing to accept for being wrong (for falsely rejecting the null hypothesis). Consequently, before you run the spatial statistic, you select a confidence level. Typical confidence levels are 90, 95, or 99 percent. A confidence level of 99 percent would be the most conservative in this case, indicating that you are unwilling to reject the null hypothesis unless the probability that the pattern was created by random chance is really small (less than a 1 percent probability).
Confidence levels
The table below shows the unadjusted critical p-values and z-scores for various confidence levels.
Tools that enable FDR will use adjusted critical p-values. These critical values will be the same or less than those shown in the table below.
z-score (standard deviations) | p-value (probability) | Confidence level |
---|---|---|
< -1.65 or > +1.65 | < 0.10 | 90% |
< -1.96 or > +1.96 | < 0.05 | 95% |
< -2.58 or > +2.58 | < 0.01 | 99% |
Consider an example. The critical z-score values when using a 95 percent confidence level are -1.96 and +1.96 standard deviations. The unadjusted p-value associated with a 95 percent confidence level is 0.05. If the z-score is between -1.96 and +1.96, the unadjusted p-value will be greater than 0.05, and you cannot reject the null hypothesis, because the pattern exhibited could very likely be the result of random spatial processes. If the z-score falls outside that range (for example, -2.5 or +5.4 standard deviations), the observed spatial pattern is probably too unusual to be the result of random chance, and the p-value will be small to reflect this. In this case, it is possible to reject the null hypothesis and proceed with finding out what might be causing the statistically significant spatial structure in your data.
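The unadjusted critical z-scores in the table can be reproduced from the confidence level with the inverse of the standard normal cumulative distribution. The sketch below assumes a two-sided test; the helper names `critical_z` and `can_reject_null` are hypothetical and do not correspond to any ArcGIS function.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Two-sided critical z-score for a confidence level given as a fraction, e.g. 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def can_reject_null(z, confidence=0.95):
    """True when the z-score falls outside the range -critical_z .. +critical_z."""
    return abs(z) > critical_z(confidence)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: reject when |z| > {critical_z(level):.3f}")

# The z-scores used in the example above, checked at the 95 percent level.
for z in (-1.2, 0.19, -2.5, 5.4):
    print(f"z = {z:+.2f}: reject null hypothesis = {can_reject_null(z)}")
```

The computed thresholds are approximately 1.645, 1.960, and 2.576, which the table rounds to two decimal places.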
The key idea here is that values in the middle of a normal distribution (z-scores such as 0.19 or -1.2, for example) represent the expected result. When the absolute value of the z-score is large and the probabilities are small (in the tails of the normal distribution), however, you see something unusual and generally very interesting. For the Hot Spot Analysis tool, for example, “unusual” means a statistically significant “hot” or “cold” spot.
FDR correction
Local spatial pattern analysis tools, including Hot Spot Analysis and Cluster and Outlier Analysis (Anselin Local Moran’s I), offer an additional option, Apply FDR correction, where FDR stands for false discovery rate. When this setting is enabled, the FDR correction lowers the critical p-value thresholds shown in the table above to account for multiple testing and spatial dependency. The reduction, if any, is a function of the number of input features and the neighborhood structure used.
Local spatial pattern analysis tools work by considering each feature within the context of neighboring features and determining whether the local pattern (a target feature and its neighbors) is statistically different from the global pattern (all features in the dataset). The z-score and p-value associated with each feature indicate whether the difference is statistically significant or not. This analytical approach creates issues with both multiple testing and spatial dependency.
Multiple testing – with a confidence level of 95 percent, probability theory tells us that there are 5 chances in 100 that a spatial pattern could appear structured (clustered or dispersed, for example) and be associated with a statistically significant p-value when, in fact, the underlying spatial processes producing the pattern are truly random. In those cases, we would falsely reject the null hypothesis because of the statistically significant p-values. Five chances in 100 seems quite conservative until you consider that local spatial statistics perform a test for every feature in the dataset. If there are 10,000 features, for example, we might expect as many as 500 false results.
Spatial dependency – features near each other tend to be similar; more often than not, spatial data exhibits this type of dependency. Nonetheless, many statistical tests require features to be independent. This matters for local pattern analysis tools because spatial dependency can artificially inflate statistical significance. Spatial dependency is exacerbated by local pattern analysis tools because each feature is evaluated within the context of its neighbors, and features that are near each other will likely share many of the same neighbors. This overlap accentuates spatial dependency.
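The multiple testing arithmetic described above (10,000 features tested at a 95 percent confidence level) can be checked with a small simulation. This sketch assumes independent tests and a true null hypothesis for every feature, so that each p-value is uniformly distributed between 0 and 1; it deliberately ignores spatial dependency. The function name `count_false_positives` is hypothetical.

```python
import random


def count_false_positives(n_features, alpha=0.05, seed=0):
    """Simulate one test per feature under a true null and count p-values below alpha."""
    rng = random.Random(seed)
    # Under a true null hypothesis, a p-value is uniform on [0, 1],
    # so each test has probability alpha of looking significant by chance.
    return sum(rng.random() < alpha for _ in range(n_features))


n_features = 10_000
alpha = 0.05
expected = round(n_features * alpha)          # expected number of false results
observed = count_false_positives(n_features)  # one simulated realization

print(f"expected false positives: {expected}")
print(f"simulated false positives: {observed}")
```

The simulated count varies with the seed but stays in the neighborhood of the expected value.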
At least three approaches are used to handle the multiple testing and spatial dependency problems. The first approach is to ignore the problem on the basis that the individual test performed for each feature in the dataset should be considered in isolation. With this approach, however, it is very likely that some statistically significant results will be incorrect (they appear statistically significant when the underlying spatial processes are in fact random). The second approach is to apply a classical multiple testing procedure, such as the Bonferroni correction or the Šidák correction. These methods are usually too conservative, however. Although they greatly reduce the number of false positives, they also miss statistically significant results that do exist. A third approach is to apply an FDR correction, which estimates the number of false positives for a given confidence level and adjusts the critical p-value accordingly. With this method, statistically significant p-values are ranked from smallest (strongest) to largest (weakest), and, based on the false positive estimate, the weakest are removed from the list. The remaining features with statistically significant p-values are identified by the Gi_Bin or COType fields in the output feature class. While not ideal, this method has been shown to perform better in empirical tests than treating each test in isolation or using the traditional, often overly conservative, multiple testing methods. See the Additional Resources section for more information on FDR correction.
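To make the three approaches concrete, the sketch below compares an uncorrected threshold, the Bonferroni and Šidák thresholds, and an FDR procedure on a handful of made-up p-values. The FDR step uses the Benjamini-Hochberg procedure, a widely used FDR method; whether the ArcGIS tools implement exactly this procedure is an assumption here, and all function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test critical p-value under the Bonferroni correction."""
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test critical p-value under the Sidak correction."""
    return 1 - (1 - alpha) ** (1 / n_tests)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which tests stay significant under FDR control."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / n * alpha:
            cutoff_rank = rank  # largest rank whose p-value passes its threshold
    keep = [False] * n
    for i in order[:cutoff_rank]:
        keep[i] = True
    return keep


# Made-up p-values, one per feature (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
n = len(p_values)

uncorrected = sum(p < 0.05 for p in p_values)
bonferroni = sum(p < bonferroni_threshold(0.05, n) for p in p_values)
sidak = sum(p < sidak_threshold(0.05, n) for p in p_values)
fdr = sum(benjamini_hochberg(p_values))

print(f"significant: uncorrected={uncorrected}, Bonferroni={bonferroni}, "
      f"Sidak={sidak}, FDR={fdr}")
```

On this toy input the FDR procedure keeps fewer results than the uncorrected threshold but more than the Bonferroni and Šidák thresholds, which matches the trade-off described above.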
Null Hypothesis and Spatial Statistics
Several statistics in the Spatial Statistics toolbox are inferential spatial pattern analysis techniques, including Spatial Autocorrelation (Global Moran’s I), Cluster and Outlier Analysis (Anselin Local Moran’s I), and Hot Spot Analysis (Getis-Ord Gi*). Inferential statistics are grounded in probability theory. Probability is a measure of chance, and underlying all statistical tests (either directly or indirectly) are probability calculations that assess the role of chance in the outcome of your analysis. Typically, with traditional (nonspatial) statistics, you work with a random sample and try to determine the probability that your sample data is a good representation (is reflective) of the population at large. As an example, you might ask, “What are the chances that the results of my voter poll (showing that Candidate A will slightly outperform Candidate B) reflect the final election results?” But with many spatial statistics, including the spatial autocorrelation statistics listed above, you are often dealing with all the data available for the study area (all crimes, all cases of illness, attributes for every census tract, and so on). When you calculate a statistic for the entire population, you no longer have an estimate at all. You have a fact. Consequently, it no longer makes sense to talk about likelihood or probabilities. So how can spatial pattern analysis tools, often applied to all data in a study area, legitimately report probabilities? The answer is that they can do this by postulating, through the null hypothesis, that the data is, in fact, part of some larger population. Let’s consider this in more detail.
The randomization null hypothesis – where appropriate, the tools in the Spatial Statistics toolbox use the randomization null hypothesis as the basis for statistical significance testing. The randomization null hypothesis postulates that the observed spatial pattern of your data represents one of many (n!) possible spatial arrangements. If you could pick up your data values and throw them down onto the features in your study area, you would have one possible spatial arrangement of those values. (Note that picking up your data values and throwing them down arbitrarily is an example of a random spatial process.) The randomization null hypothesis states that if you could do this exercise (pick them up and throw them down) an infinite number of times, most of the time you would produce a pattern that is not markedly different from the observed pattern (your real data). Once in a while you might accidentally throw all the highest values into the same corner of your study area, but the probability of doing that is small. The randomization null hypothesis states that your data is one of many, many, many possible versions of complete spatial randomness. The data values are fixed; only their spatial arrangement could change.
The normalization null hypothesis – a common alternative null hypothesis, not implemented for the Spatial Statistics toolbox, is the normalization null hypothesis. It postulates that the observed values are derived from an infinitely large, normally distributed population of values through some random sampling process. With a different sample, you would get different values, but you would still expect those values to be representative of the larger distribution. The normalization null hypothesis states that the values represent one of many possible samples of values. If you could fit your observed data to a normal curve and randomly select values from that distribution to toss onto your study area, most of the time you would produce a pattern and distribution of values that are not markedly different from the observed pattern and distribution (your real data).
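The “pick up the values and throw them down again” idea behind the randomization null hypothesis can be sketched as a permutation test. This is a simplified illustration under stated assumptions, not how the ArcGIS tools compute significance: the features are assumed to lie along a line with each one neighboring the next, the statistic is Moran’s I with binary adjacency weights, the data values are made up, and the test is one-sided (it looks only for clustering). The function names are hypothetical.

```python
import random


def morans_i_line(values):
    """Moran's I for values on a line where each location neighbors the next one."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of cross-products over adjacent pairs; with symmetric binary weights
    # the general formula reduces to n / (n - 1) * cross / sum of squared deviations.
    cross = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    return (n / (n - 1)) * cross / sum(d * d for d in dev)


def permutation_p_value(values, n_permutations=999, seed=0):
    """Pseudo p-value: share of arrangements at least as clustered as the observed one."""
    rng = random.Random(seed)
    observed = morans_i_line(values)
    shuffled = list(values)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # same values, new spatial arrangement
        if morans_i_line(shuffled) >= observed:
            at_least_as_extreme += 1
    # The observed arrangement is counted as one of the possible arrangements.
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up values with low values at one end and high values at the other.
clustered = [1, 2, 2, 3, 3, 4, 8, 9, 9, 10, 11, 12]
observed_i, p = permutation_p_value(clustered)
print(f"observed Moran's I = {observed_i:.3f}, pseudo p-value = {p:.3f}")
```

The data values never change between permutations; only their arrangement does, which is exactly what the randomization null hypothesis postulates.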